RANDOM SUBGRAPHS OF THE 2D HAMMING GRAPH: 
THE SUPERCRITICAL PHASE 



REMCO VAN DER HOFSTAD AND MALWINA J. LUCZAK 

Abstract. We study random subgraphs of the 2-dimensional Hamming graph $H(2,n)$, which is
the Cartesian product of two complete graphs on $n$ vertices. Let $p$ be the edge probability, and
write $p = \frac{1+\varepsilon}{2(n-1)}$ for some $\varepsilon \in \mathbb{R}$. In [4, 5], the size of the largest connected
component was estimated precisely for a large class of graphs including $H(2,n)$ for
$\varepsilon \le \Lambda V^{-1/3}$, where $\Lambda > 0$ is a constant and $V = n^2$ denotes the number of vertices in $H(2,n)$.
Until now, no matching lower bound on the size in the supercritical regime has been obtained.
In this paper we prove that, when $\varepsilon \gg (\log V)^{1/3} V^{-1/3}$, then the largest connected component
has size close to $2\varepsilon V$ with high probability. We thus obtain a law of large numbers for the largest
connected component size, and show that the corresponding values of $p$ are supercritical. Barring
the factor $(\log V)^{1/3}$, this identifies the size of the largest connected component all the way down
to the critical $p$ window.

1. Introduction 

We study random subgraphs of the 2-dimensional Hamming graph $H(2,n)$. The $d$-dimensional
Hamming graph is a graph on $V = n^d$ vertices, each corresponding to one of the $n^d$ distinct
$d$-vectors $v = (v_1, \ldots, v_d) \in \{1, \ldots, n\}^d$. A pair of vertices are connected by an edge if and only
if these vertices differ in precisely one coordinate. (See for example [9] for more information on
the properties of Hamming graphs.) The 1-dimensional Hamming graph $H(1,n)$ is the complete
graph; for $d \ge 2$, the graph $H(d,n)$ is the Cartesian product of $d$ complete graphs on $n$ vertices.
In particular, it is transitive and the degree of each vertex is $\Omega = d(n-1)$.
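
As a small illustration of this definition, the following sketch builds $H(d,n)$ explicitly and prints the vertex count $n^d$ and the set of degrees, which should contain only $d(n-1)$. The function names `hamming_neighbors` and `hamming_graph` are hypothetical helpers introduced here for illustration, not part of the paper.

```python
from itertools import product


def hamming_neighbors(v, n):
    """All vertices of H(d, n) differing from the tuple v in exactly one coordinate."""
    out = []
    for i, vi in enumerate(v):
        for x in range(1, n + 1):
            if x != vi:
                out.append(v[:i] + (x,) + v[i + 1:])
    return out


def hamming_graph(d, n):
    """Adjacency lists of H(d, n); vertices are d-tuples over {1, ..., n}."""
    return {v: hamming_neighbors(v, n)
            for v in product(range(1, n + 1), repeat=d)}


if __name__ == "__main__":
    d, n = 2, 5
    G = hamming_graph(d, n)
    # Number of vertices and the set of vertex degrees (a single value, by transitivity).
    print(len(G), {len(nb) for nb in G.values()})
```

The explicit adjacency lists are only practical for small $n$; they are meant to make the definition concrete rather than to support simulation at scale.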

We write $\mathbb{P}_p$ for the probability law of the random subgraph of $G$ resulting when each edge is
occupied (or present) with probability $p$, and vacant (or absent) with probability $1-p$, independently
of all the other edges. We write $\mathbb{E}_p$ for the expectation with respect to $\mathbb{P}_p$. Also, $\mathrm{Var}_p$ will denote
the variance under $\mathbb{P}_p$.



Throughout we work with the 2-dimensional Hamming graph $H(2,n)$ unless explicitly stated
otherwise, and we shall assume that $p = \frac{1+\varepsilon}{2(n-1)}$, where $\varepsilon = \varepsilon(n) \in (0,1)$ tends to 0 in a
certain way to be specified below. Our goal is to study properties of random subgraphs of $H(2,n)$
under $\mathbb{P}_p$.
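
The model can be sampled directly. The sketch below is a minimal illustration, assuming a union-find representation of clusters; the helper name `percolation_cluster_sizes` and the parameter choices in the demo are hypothetical. It retains each edge of $H(2,n)$ independently with probability $p$ and returns the list of cluster sizes.

```python
import random
from collections import Counter


def percolation_cluster_sizes(n, p, rng):
    """Sample bond percolation on H(2, n) with edge probability p; return cluster sizes."""
    V = n * n
    parent = list(range(V))

    def find(a):
        # Union-find lookup with path halving.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb

    for i in range(n):
        for j in range(n):
            u = i * n + j
            # Edges to vertices sharing the first coordinate (each edge visited once).
            for j2 in range(j + 1, n):
                if rng.random() < p:
                    union(u, i * n + j2)
            # Edges to vertices sharing the second coordinate.
            for i2 in range(i + 1, n):
                if rng.random() < p:
                    union(u, i2 * n + j)
    return list(Counter(find(a) for a in range(V)).values())


if __name__ == "__main__":
    n, eps = 60, 0.3  # illustrative values only; the paper's results are asymptotic
    p = (1 + eps) / (2 * (n - 1))
    sizes = percolation_cluster_sizes(n, p, random.Random(0))
    print("largest cluster / V:", max(sizes) / (n * n), " 2*eps:", 2 * eps)
```

For such small $n$ and fixed $\varepsilon$ the printed ratio need not be close to $2\varepsilon$; the comparison is only meant to show how one would set up a numerical experiment.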

Random subgraphs of finite tori with various edge sets were studied in quite some generality
in [4, 5], and we now highlight the key results of these papers. Some of the theorems in [4, 5] apply
to a general finite transitive graph, which in what follows will be denoted by $G$. We also denote
the number of vertices or volume of $G$ by $V = |G|$ and the vertex degree by $\Omega$. Given a vertex $v$
of $G$, we shall write $C(v)$ for the connected component or cluster containing $v$, and $|C(v)|$ for the
number of vertices in $C(v)$. Further, we let $\chi(p)$ be the expected size of the cluster containing $v$,
that is

$$\chi(p) = \mathbb{E}_p[|C(v)|]. \tag{1.1}$$

(Note that, by transitivity, this is independent of the choice of $v$.) Then in [4, 5] the critical
threshold $p_c = p_c(G, \lambda)$ of a finite transitive graph $G$ is defined to be the unique solution to the
equation

$$\chi(p_c) = \lambda V^{1/3}, \tag{1.2}$$
where $\lambda > 0$ is a sufficiently small constant. (See [5] for details concerning the precise constraints
on the size of $\lambda$.)

In [1], cluster sizes were investigated for graphs $G$ satisfying the so-called triangle condition. In
[5], the triangle condition was established for certain types of graphs $G$, including the Hamming
graph $H(d,n)$ of a general dimension $d \ge 1$. We shall now describe these results briefly in order
to set up our own scene.

Let $C_{\max}$ denote a cluster of maximum size, where we may pick any such cluster if it is not
unique. Then $|C_{\max}|$ is the maximum cluster size, that is

$$|C_{\max}| = \max\{|C(v)| : v \in G\}. \tag{1.3}$$

The main theorems in [1] concern the scaling of $\chi(p)$ and bounds on $|C_{\max}|$ in graphs $G$ satisfying
the triangle condition as $|G| = V \to \infty$. Specifically, it is shown in [4, 5] that, if $p_c$ is as in (1.2)
and

$$p = p_c + \frac{\varepsilon}{\Omega}, \tag{1.4}$$
then, for all $\varepsilon$ such that $\varepsilon V^{1/3} \to -\infty$, asymptotically the expected cluster size $\chi(p)$ satisfies

$$\chi(p) = \frac{1}{|\varepsilon|}\,(1 + o(1)). \tag{1.5}$$



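With regard to the maximum cluster size, for all $\omega \ge 1$, as $V \to \infty$,

$$\mathbb{P}_p\Big(\frac{\chi^2(p)}{3600\,\omega} \le |C_{\max}| \le 2\chi^2(p)\log\big(V/\chi^3(p)\big)\Big) \ge \Big(1 + \frac{36\,\chi^3(p)}{\omega V}\Big)^{-1} - \frac{\sqrt{e}}{\big[2\log\big(V/\chi^3(p)\big)\big]^{3/2}}. \tag{1.6}$$

The above describes the behaviour of the mean and maximum cluster sizes for subcritical $p$ values,
which are $p$ values satisfying $\varepsilon V^{1/3} \to -\infty$; in particular, the bounds apply to $H(2,n)$.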

For a constant $\Lambda > 0$, the critical window is defined as the interval of all $p = p_c + \varepsilon/\Omega$ such that
$|\varepsilon| \le \Lambda V^{-1/3}$. Theorem 1.3 in [1] shows that, for some constant $b = b(\Lambda)$, the maximum cluster
size inside the critical window satisfies 

The corresponding results in [4, 5] are significantly weaker in the case $p = p_c + \varepsilon/\Omega$ where $\varepsilon^3 V \to \infty$
(that is, when $p$ is above the critical window or supercritical). In particular, only upper bounds on
the maximum cluster size are established therein. More precisely, it is proved in [1] that, for all
$\omega \ge 1$,

$$\mathbb{P}_p\big(|C_{\max}| \ge \omega(V^{2/3} + \varepsilon V)\big) \le \frac{21}{\omega}. \tag{1.8}$$

The problem with this result is that it does not imply that $p_c$ as defined in (1.2) actually is the
critical value, and thus that $p = p_c + \varepsilon/\Omega$ with $\varepsilon^3 V \to \infty$ really is above the critical window. Indeed,
to prove that this is the case, one additionally needs a lower bound on the maximum connected
component size. No such results are established in [4, 5], and we expect that the geometry of the
graphs under consideration plays a crucial role in lower bounding the largest cluster size.
The aim of this paper is to establish the asymptotics of the maximum supercritical cluster for
the 2-dimensional Hamming graph $H(2,n)$. Throughout our proofs we shall use the phrase "with
high probability" (abbreviated as "whp") to mean "with probability tending to 1 as $V \to \infty$".
Also, "with very high probability" (abbreviated as "wvhp") will mean "with probability at least 
1 — 0{V~^) a.s V —>■ oo". All unspecified limits are as V oo. Given an event E, I[E] will 
denote the indicator of $E$. We write $\mathbb{P}(\cdot)$ for a generic probability measure (for instance, the
probability measure corresponding to a sequence of i.i.d. binomial random variables), which may
vary from situation to situation. We use the $O_p$ and $o_p$ notations in the standard way (see e.g.
Janson, Łuczak and Ruciński [18]). For example, if $(X_n)$ is a sequence of random variables, then
$X_n = O_p(1)$ means "$X_n$ is bounded in probability" and $X_n = o_p(1)$ means that $X_n$ converges
to zero in probability as $n \to \infty$. We shall also use the asymptotic $o(\cdot)$, $O(\cdot)$, $\Omega(\cdot)$, $\Theta(\cdot)$ notations
(without the subscript "p") in the standard way, and again referring to the regime where $V \to \infty$.
We write $f(V) \gg g(V)$ (resp. $f(V) \ll g(V)$) when $g(V) = o(f(V))$ (resp. $f(V) = o(g(V))$) as
$V \to \infty$. Throughout, the symbol "~" refers to, often heuristic, estimates of the leading order as
$V \to \infty$, with unspecified constants and thus uncontrolled error terms. Finally, we denote by $C$ a
generic (unspecified) positive constant, which may change from line to line. We shall interchange
this use of $C$ with the $O(\cdot)$ notation.

1.1. The model. We consider the Hamming graph $H(d,n)$, and take the edge probability
$p = (1+\varepsilon)/\Omega$. We first argue that this agrees asymptotically with the choice of $p$ in (1.4). Let us note
that [4, Theorem 1.5] establishes that, for a graph $G$ satisfying the triangle condition,

$$1 - \chi(p_c)^{-1} \le \Omega p_c \le 1 - \chi(p_c)^{-1} + O(\Omega^{-1}). \tag{1.9}$$

When $G = H(d,n)$, then $\Omega = d(n-1)$ and $\chi(p_c) = \lambda V^{1/3} = \lambda n^{d/3}$. Therefore, if $\varepsilon = \Theta(V^{-1/3})$,
then

$$p = \frac{1+\varepsilon}{\Omega} = p_c + \frac{\varepsilon}{\Omega}\,(1 + O(1)), \tag{1.10}$$
while for $p$ outside the critical window,

$$p = \frac{1+\varepsilon}{\Omega} = p_c + \frac{\varepsilon}{\Omega}\,(1 + o(1)). \tag{1.11}$$

Since in the case $d = 2$, we have that $\Omega^{-2} = o(\varepsilon/\Omega)$ for $\varepsilon \gg V^{-1/3}$, the critical value defined
in [4, 5] agrees asymptotically to leading order with the value $1/(d(n-1)) = 1/\Omega$. In particular,
$p = 1/(d(n-1))$ is inside the critical window of [4, 5]. This shows that we are working in the correct
range of $p$ values. For $d \ge 3$, (1.10)-(1.11) may not necessarily be valid, and we shall discuss this
issue in more detail in Section 1.2.

From now on, we concentrate on the supercritical case, that is $\varepsilon \gg V^{-1/3} = n^{-2/3}$. Our main
result is the following:

Theorem 1.1 (The supercritical phase for $H(2,n)$). Consider the 2-dimensional Hamming graph
$H(2,n)$. Let $p = p_c + \varepsilon/\Omega$ and let $V^{-1/3}(\log V)^{1/3} \ll \varepsilon \ll 1$. Then

$$|C_{\max}| = 2\varepsilon V\,(1 + o_p(1)). \tag{1.12}$$

Theorem 1.1 shows that, when $V^{-1/3}(\log V)^{1/3} \ll \varepsilon \ll 1$, the largest connected component satisfies
a law of large numbers. Barring the factor $(\log V)^{1/3}$ in the lower bound on $\varepsilon$, Theorem 1.1 identifies
the asymptotic size of the largest cluster all the way down to the critical threshold. Therefore,
our result demonstrates that $p_c = \frac{1}{2(n-1)}$ really is the critical value for random subgraphs of the
2-dimensional Hamming graph. We believe that our proof can be adapted to deal with the case
where $\varepsilon > 0$ is fixed. Here, the corresponding statement would be that $|C_{\max}| \sim \zeta_{1+\varepsilon} V$ whp, where
$\zeta_\lambda$ is the survival probability of a Poisson branching process with mean offspring $\lambda$. Since the proof
of Theorem 1.1 is the most challenging for $\varepsilon$ as close as possible to the critical window $V^{-1/3}$, we
choose not to consider the case of constant $\varepsilon$ in this paper.
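
To see how the survival probability of a Poisson branching process with mean offspring $1+\varepsilon$ relates to the constant $2\varepsilon$ in (1.12), the following sketch solves the standard fixed-point equation $1 - \zeta = e^{-\lambda\zeta}$ numerically. The function name `poisson_survival` and the bisection approach are choices made here for illustration, not taken from the paper.

```python
import math


def poisson_survival(lam, iters=200):
    """Positive root of 1 - z = exp(-lam * z), the survival probability for mean lam > 1."""
    if lam <= 1.0:
        return 0.0  # (sub)critical Poisson branching processes die out almost surely

    def f(z):
        # f(z) = 1 - z - exp(-lam * z), written with expm1 for accuracy near z = 0.
        return -z - math.expm1(-lam * z)

    # f > 0 on (0, 2(lam-1)/lam^2) and f(1) < 0, so the root is bracketed.
    lo, hi = (lam - 1.0) / lam ** 2, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for eps in (0.2, 0.1, 0.05, 0.01):
        z = poisson_survival(1.0 + eps)
        print(eps, z, z / (2 * eps))
```

The printed ratio approaches 1 as $\varepsilon$ decreases, which is consistent with $2\varepsilon V$ being the small-$\varepsilon$ form of $\zeta_{1+\varepsilon} V$.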

Before giving a proof of Theorem 1.1 we discuss its statement in more detail in Section 1.2 below.
Therein we also include some conjectures concerning Hamming graphs of a general dimension $d$.

1.2. Discussion and heuristics. We first sketch an intuitive picture justifying the definition of
$p_c$ from [4, 5] given in (1.2). This picture relies on a branching process approximation for $p < p_c$.

We expect random clusters in our model to exhibit behaviour similar to that of a subcritical 
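branching process. Therefore, from the theory of branching processes, if $p = p_c + \varepsilon/\Omega$ is just below
the critical point (for instance, if $\varepsilon < 0$), then we should have (e.g., from the Otter-Dwass formula,
see Lemma 3.4 below)

$$\mathbb{P}_p(|C(v)| \ge k) \sim \frac{1}{\sqrt{k}}\, e^{-\frac{k}{2}\varepsilon^2 (1+o(1))}, \tag{1.13}$$

which in turn implies that

$$\chi(p) = \mathbb{E}_p[|C(v)|] \sim \int_0^\infty x^{-1/2} e^{-\frac{x}{2}\varepsilon^2(1+o(1))}\,dx \sim \int_0^{\varepsilon^{-2}} x^{-1/2}\,dx \sim |\varepsilon|^{-1}. \tag{1.14}$$

Thus, in fact,

$$\mathbb{P}_p(|C(v)| \ge k) \sim \frac{1}{\sqrt{k}}\, e^{-k/(2\chi(p)^2)}, \tag{1.15}$$
and hence for subcritical $p$ (possibly up to logarithmic corrections)

$$|C_{\max}| \sim \chi(p)^2 \quad \text{whp}. \tag{1.16}$$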
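
The step from the tail heuristic to an expected cluster size of order $|\varepsilon|^{-1}$ can be checked numerically. The sketch below sums the heuristic tail $k^{-1/2} e^{-\varepsilon^2 k/2}$ over $k$, taking the unspecified constant to be 1 and dropping the $o(1)$ correction (both are assumptions of this illustration), and compares $|\varepsilon|$ times the sum with $\sqrt{2\pi}$, the value of the corresponding integral $\int_0^\infty x^{-1/2}e^{-\varepsilon^2 x/2}\,dx$ multiplied by $|\varepsilon|$.

```python
import math


def heuristic_mean_cluster_size(eps):
    """Sum of k^(-1/2) * exp(-eps^2 k / 2) over k >= 1, truncated where terms are negligible."""
    a = eps * eps / 2.0
    K = int(40.0 / a)  # beyond this point exp(-a k) < exp(-40)
    return sum(math.exp(-a * k) / math.sqrt(k) for k in range(1, K + 1))


if __name__ == "__main__":
    for eps in (0.1, 0.05, 0.02):
        chi = heuristic_mean_cluster_size(eps)
        # eps * chi should stabilise near sqrt(2*pi) as eps decreases.
        print(eps, eps * chi, math.sqrt(2 * math.pi))
```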

On the other hand, in the case $p > p_c$ there should be a connected component dominating all
the others in size. One way to express this intuitive statement is to impose that

$$\chi(p) = \mathbb{E}_p[|C(v)|] \sim \mathbb{E}_p\big[|C(v)|\, I[v \in C_{\max}]\big] = \frac{1}{V}\,\mathbb{E}_p\big[|C_{\max}|^2\big]. \tag{1.17}$$

Naturally, the meaning of formula (1.17) is, in essence, that the main contribution to the expected
size of a cluster of any particular vertex $v$ is from those configurations where this vertex lies in
the largest component.

Note that (1.17) could be taken as a defining property of supercritical behaviour. Then the
critical window can be defined as the interval of $p$ values where the subcritical and supercritical
pictures coincide. In other words, if $p$ lies within the critical window then both (1.16) and (1.17)
should be satisfied.

Assume further that a sufficient amount of the concentration of measure exhibited by $|C_{\max}|$
in the subcritical regime (as implied by (1.16)) carries through to the critical window, so that
$\mathbb{E}_p[|C_{\max}|^2] \sim \chi(p)^4$. It then follows that, for $p$ inside the critical window,

$$\chi(p) \sim \frac{1}{V}\,\mathbb{E}_p\big[|C_{\max}|^2\big] \sim \frac{1}{V}\,\chi(p)^4; \tag{1.18}$$
and hence, inside the critical window, we are led to

$$\chi(p) \sim V^{1/3}. \tag{1.19}$$

This provides a rationale for the definition (1.2) of the critical threshold $p_c$. In conclusion, the
above heuristic demonstrates that branching process approximations in the subcritical regime and
the domination of the expected cluster size by the maximum cluster size in the supercritical regime
together imply that (1.2) is the "right" definition for $p_c$.

At this point, we emphasise that subcritical branching process approximations are only likely 
to be valid for a random graph that is sufficiently mean-field in character, in the sense that its 
geometry is of little significance for the structure of its random subgraphs. This is the case for 
sufficiently high-dimensional random graphs, but cannot be expected to hold for low-dimensional 
random graphs, as indicated in [3, E] . For random subgraphs of the torus with nearest-neighbour 
bonds in a sufficiently high (but constant) dimension, as well as for the torus with sufficiently 
spread-out bonds in dimensions greater than 6, it is shown in [12] that the largest critical
connected component is of order $V^{2/3}$, with logarithmic corrections in the lower bound. Accordingly,
assuming universality in high-dimensional finite-range percolation, one can expect classical
random graph asymptotics at criticality to be valid for random subgraphs of the torus when $d > 6$ for
general choices of finite-range edges. On the other hand, the results of [7, 8] suggest that random
graph asymptotics at the phase transition threshold are not valid for random subgraphs of the
$d$-dimensional torus when $d < 6$.

We close this section with a few comments and conjectures. The present paper verifies the
location of the critical window found in [4, 5] up to a factor $(\log V)^{1/3}$. The main barrier to
overcoming this separation with our approach is the fact that we require concentration of measure
for the number of vertices with either the first or second coordinate fixed in order for our estimates
to be sufficiently precise; this concentration property fails when $\varepsilon$ is too small. Similar issues cause
problems with extensions of our approach to $H(d,n)$ for $d > 2$, although we believe that it could
handle the $d = 3$ case. Let us mention at this point that the $(\log V)^{1/3}$ separation has since been
removed by Nachmias [22], using non-backtracking random walks. He also manages to nail down
the critical window from above in $H(3,n)$, although he does not establish laws of large numbers
for the giant component for $H(2,n)$ or $H(3,n)$.

In the present paper we investigate the scaling of the largest connected component in supercrit- 
ical percolation on the Hamming graph H{2,n). Many random graph models are well known to 
satisfy what is sometimes referred to as a discrete duality principle (see for instance [H Section 
10.5]). This is the principle that the size of the second largest supercritical component is asymptot- 
ically close in distribution to the size of the largest subcritical component for an appropriate choice 
of subcritical edge probability. This notion of duality is closely related to the duality exhibited by 
branching processes [21 [101 [HI [IS] , or [H Section 10.4]. We expect the Hamming graph H{2,n) 
to follow the discrete duality principle. More precisely, we expect that if we were to remove the
largest connected component when $p = p_c + \varepsilon/\Omega$ with $\varepsilon \gg V^{-1/3}$, then the resulting connected
components would be like those of the Hamming graph with $p = p_c - \varepsilon/\Omega$. In particular, letting
$|C_{(2)}|$ be the size of the second largest component, we conjecture that

$$|C_{(2)}| = 2\varepsilon^{-2}\log(\varepsilon^3 V)\,(1 + o_p(1)). \tag{1.20}$$

For the Hamming graph H{d, n) of an arbitrary dimension we conjecture that critical p values 
are of the form 

p=5^a,n- + -^, (1.21) 

where $\Lambda$ is an arbitrary constant, and the coefficients $a_i = a_i(d)$ are independent of $n$. Note that
$p_c = 1/\Omega = 1/(d(n-1))$ corresponds to $a_i = a_i(d) = 1/d$ for all $i \ge 1$, while
$p_c = 1/(\Omega - 1) = 1/(d(n-1) - 1)$, where $d(n-1) - 1$ is the forward branching ratio of $H(d,n)$,
corresponds to $a_i = a_i(d) = (d+1)^{i-1}/d^{i}$ for all $i \ge 1$. We believe that, when $d$ is sufficiently
large, there exists an $i$ such that $a_i(d) \ne 1/d$ and $a_i(d) \ne (d+1)^{i-1}/d^{i}$. In particular, if this is
indeed true, then, for $\varepsilon = \Theta(V^{-1/3})$ and $d$ sufficiently large, the edge probability $p = p_c + \varepsilon/\Omega$ is not
the same as $p = \frac{1+\varepsilon}{\Omega}$ or $p = \frac{1+\varepsilon}{\Omega - 1}$. For $d = 2$, however, these choices do asymptotically agree, as
explained in (1.10)-(1.11).
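
The two coefficient sequences quoted above come from expanding $1/(d(n-1))$ and $1/(d(n-1)-1)$ in powers of $1/n$. The sketch below checks this with exact rational arithmetic; the helper names `a_degree`, `a_forward` and `truncated_expansion` are hypothetical and introduced only for this illustration.

```python
from fractions import Fraction


def a_degree(d, i):
    """Coefficient of n^(-i) in the expansion of 1 / (d (n - 1))."""
    return Fraction(1, d)


def a_forward(d, i):
    """Coefficient of n^(-i) in the expansion of 1 / (d (n - 1) - 1)."""
    return Fraction((d + 1) ** (i - 1), d ** i)


def truncated_expansion(coeff, d, n, m):
    """Partial sum of coeff(d, i) * n^(-i) over i = 1, ..., m, as an exact fraction."""
    return sum(coeff(d, i) * Fraction(1, n ** i) for i in range(1, m + 1))


if __name__ == "__main__":
    d, n = 3, 10
    for m in (1, 2, 4, 8):
        err_degree = Fraction(1, d * (n - 1)) - truncated_expansion(a_degree, d, n, m)
        err_forward = Fraction(1, d * (n - 1) - 1) - truncated_expansion(a_forward, d, n, m)
        # Both truncation errors shrink geometrically in m.
        print(m, float(err_degree), float(err_forward))
```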

To explain why we believe (1.21) to hold, we note that [4, 5] indeed gives that the critical window
consists of $p$ values given by $p = p_c + \mu V^{-1/3}/\Omega$, that is $p = p_c + \mu n^{-d/3}/\Omega$ on $H(d,n)$. Thus,
(1.21) follows for all $\mu$, as long as it holds for one particular value of $p$ inside the critical window,
for example, for $p = p_c$ defined in (1.2), for any $\lambda$, for $d$ fixed and $n \to \infty$. Such asymptotic
expansions of critical values in terms of the vertex degree have been established for the $n$-cube
and for nearest-neighbour percolation on $\mathbb{Z}^d$ in [13, 14]. These expansions arise since the value
$p_c$ satisfies an implicit equation in terms of certain "Feynman diagrams" occurring in the lace
expansion analysis, and these diagrams can be proved to obey asymptotic expansions that in turn
imply that $p_c$ has an asymptotic expansion. We expect that this part of the analysis in [13, 14]
can be extended to Hamming graphs, and will allow one to compute the numerical values of $a_i(d)$.
The proof of this conjecture would enable an extension to random subgraphs of $H(d,n)$ of the
phase transition description available for the classical Erdős-Rényi random graph.

For $p$ inside the critical window, $|C_{\max}|$ is of the order $V^{2/3} = n^{2d/3}$, as proved in [4, 5]. Below
the critical window, we expect that the average cluster size satisfies $\chi(p) \sim [\Omega(p_c - p)]^{-1}$, while
the maximum cluster size satisfies

$$|C_{\max}| \sim 2\chi(p)^2 \log\big(V/\chi(p)^3\big) \quad \text{whp}. \tag{1.22}$$

Note that [1] establishes in full only the upper bound part of (1.22), the corresponding best lower
bound therein being $|C_{\max}| \ge \chi(p)^2/(3600\,\omega)$ whp for $\omega$ large (cf. (1.6)). (It is the upper bound,
however, that is relevant for locating the phase transition window.) We anticipate that above the
critical window

$$|C_{\max}| \sim 2\varepsilon V \quad \text{whp}, \tag{1.23}$$

where $\varepsilon = \Omega(p - p_c) \gg V^{-1/3}$. Establishing the validity of the asymptotics in (1.23) in full
generality would strengthen Theorem 1.1 to all $d \ge 1$ and all $p$ above the critical window.

2. Overview of the proof of Theorem 1.1

This section contains an extensive overview of the proof of our main result, breaking it down 
into a number of key propositions and lemmas. We start by describing the general philosophy of 
the proof. 

From now on, we shall assume that $p = p_c + \varepsilon/\Omega$, where $\varepsilon > 0$. As in [3], the proof will be
centered on the investigation of the random variables

$$Z_{\ge k} = \sum_{v \in H(2,n)} I\big[|C(v)| \ge k\big], \tag{2.1}$$

the number of vertices in clusters of size at least $k$, for appropriate values of $k$. In terms of these
random variables, we have that $|C_{\max}| \ge k$ holds if and only if $Z_{\ge k} \ge 1$. By proving sufficient
concentration of measure for $Z_{\ge k}$, we are able to prove bounds on $|C_{\max}|$. The whole proof revolves
around finding the right scales of $k$ to which we can apply our arguments.
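
The relation between the count of vertices in clusters of size at least $k$ and the largest cluster can be made concrete on a list of cluster sizes. The sketch below uses a hypothetical helper `z_at_least` and an arbitrary example list; it checks that the count is positive exactly when the largest cluster has size at least $k$, and that in this case the largest cluster is no larger than the count.

```python
def z_at_least(cluster_sizes, k):
    """Number of vertices lying in clusters of size at least k."""
    return sum(s for s in cluster_sizes if s >= k)


if __name__ == "__main__":
    sizes = [7, 3, 3, 1, 1]  # arbitrary example configuration
    c_max = max(sizes)
    for k in range(1, 10):
        z = z_at_least(sizes, k)
        # |C_max| >= k  <=>  Z_{>=k} >= 1, and then |C_max| <= Z_{>=k}.
        assert (c_max >= k) == (z >= 1)
        if z >= 1:
            assert c_max <= z
        print(k, z)
```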

Specifically, we need two different scales. The first scale is the smallest possible scale $k$ for which
$\mathbb{P}_p(|C(v)| \ge k)$ is very close to $2\varepsilon$. If indeed the duality principle holds (see the discussion above
(1.20)), then, by (1.13), we expect that

$$\mathbb{P}_p(|C(v)| \ge k) = \mathbb{P}_p(|C(v)| \ge k,\ v \in C_{\max}) + \mathbb{P}_p(|C(v)| \ge k,\ v \notin C_{\max}) \sim 2\varepsilon + \frac{1}{\sqrt{k}}\, e^{-\varepsilon^2 k/2}. \tag{2.2}$$

As a result, as soon as $k \gg \varepsilon^{-2}$, we are led to

$$\mathbb{P}_p(|C(v)| \ge k) \sim 2\varepsilon, \tag{2.3}$$

so that also $\mathbb{E}_p[Z_{\ge k}] = V\,\mathbb{P}_p(|C(v)| \ge k) \sim 2\varepsilon V$. Equation (2.3) follows from Proposition 2.1 below.
Assuming sufficient concentration of measure for $Z_{\ge k}$, we then obtain that $Z_{\ge k} \sim 2\varepsilon V$ whp, and,
since $|C_{\max}| \le Z_{\ge k}$ for every $k$ for which $Z_{\ge k} \ge 1$, we obtain the required upper bound on $|C_{\max}|$.
Concentration estimates on $Z_{\ge k}$ are stated in Proposition 2.2 below.

The lower bound on $|C_{\max}|$ is slightly more involved. Here we need to find the largest possible $k$
for which we can prove that $Z_{\ge k}$ is concentrated around its mean $\mathbb{E}_p[Z_{\ge k}] \sim 2\varepsilon V$. To achieve this,
we perform a so-called two-round exposure. We first take $p_- < p$ such that $p_- = p + o(\varepsilon/\Omega)$, and
compare clusters in percolation with parameter $p_-$ to suitable lower-bounding branching processes.
Note that such comparisons can only be applied when $k \ll \varepsilon V$, so these bounds are rather "weak".
Subsequently, we "sprinkle" extra edges, so that the distribution of the final configuration is that
of percolation with parameter $p$. We prove that all the large connected components in the
$p_-$-configuration are, in fact, whp, joined all together by the sprinkled edges. We now explain the
steps in this argument in more detail.

Since $p_- < p$ and satisfies $p_- = p + o(\varepsilon/\Omega)$, all the concentration results for $Z_{\ge k}$ hold also for
$Z^-_{\ge k}$, the number of vertices in connected components of size at least $k$ in the $p_-$-percolation
configuration. Furthermore, again using the fact that $p_- = p + o(\varepsilon/\Omega)$, we have that
$\mathbb{P}_{p_-}(|C(v)| \ge k) \sim 2\varepsilon$, so that, by our concentration estimates, $Z^-_{\ge k} \sim 2\varepsilon V$ whp for all $k \ll \varepsilon V$. This establishes
the necessary "weak" bounds on connected components of size at least $k \ll \varepsilon V$.

The $p$-configuration can be coupled to the $p_-$-configuration as follows. Let $\eta > 0$ be given
by $p_- + (1 - p_-)\eta/\Omega = p$. Then, make each vacant edge occupied with probability $\eta/\Omega$,
independently of all other vacant edges. We show that, for appropriate choices of $\eta$ (and thus $p_-$)
and $k \ll \varepsilon V$, the sprinkling procedure whp connects all $p_-$-clusters of size at least $k$ into one. It
follows that $|C_{\max}| \ge Z^-_{\ge k} \sim 2\varepsilon V$ whp, establishing the lower bound. This part of the proof makes
crucial use of the fact that big components turn out to be quite "dense", in the sense that they
contain many elements along most coordinate lines; details can be found in Proposition 2.4 below.
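
The coupling between the two rounds can be sketched as follows, writing $q$ for the sprinkling probability $\eta/\Omega$. This is a minimal illustration under assumed names (`p_minus`, `two_round_exposure`) and arbitrary numerical values: it solves $p_- + (1-p_-)q = p$ for $p_-$ and checks empirically that an edge ends up occupied with frequency close to $p$ after both rounds.

```python
import random


def p_minus(p, q):
    """Solve p_minus + (1 - p_minus) * q = p for p_minus (requires 0 <= q <= p, q < 1)."""
    return (p - q) / (1.0 - q)


def two_round_exposure(num_edges, p, q, rng):
    """First round: occupy edges with probability p_minus. Second round: sprinkle on vacant edges."""
    pm = p_minus(p, q)
    first = [rng.random() < pm for _ in range(num_edges)]
    # An edge is finally occupied if it was occupied in round one, or was vacant and got sprinkled.
    final = [occ or (rng.random() < q) for occ in first]
    return first, final


if __name__ == "__main__":
    p, q = 0.3, 0.1  # arbitrary illustrative values, not the paper's scaling
    first, final = two_round_exposure(200_000, p, q, random.Random(1))
    print("round one frequency:", sum(first) / len(first), " p_minus:", p_minus(p, q))
    print("final frequency:", sum(final) / len(final), " p:", p)
```

Because the second round only ever adds edges, every cluster of the first-round configuration is contained in a cluster of the final configuration, which is the monotonicity used in the lower bound.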

As explained above, the entire analysis revolves around a delicate choice of the two different
scales. We now present our precise results, formulated in Propositions 2.1, 2.2 and 2.4 below. We
then use these propositions to complete our proof of Theorem 1.1.

Proposition 2.1 (The cluster tail). Set $p = p_c + \varepsilon/\Omega$. Let $V^{-1/3} \ll \varepsilon \ll 1$ as $V \to \infty$. Then, for
every $\eta$ such that $\varepsilon^{-2} \ll \eta V \ll \varepsilon V$,

$$\mathbb{P}_p(|C(v)| \ge \eta V) = 2\varepsilon\,(1 + o(1)). \tag{2.4}$$

Proposition 2.1 consists of two parts, corresponding to the upper and lower bounds. These are
re-stated separately in Section 4.1 as Lemmas 4.2 and 4.3 and proved in Sections 4.1 and 4.3
respectively.

The following proposition shows concentration of measure for $Z_{\ge k}$ for an appropriately chosen $k$.

Proposition 2.2 (Concentration of the number of vertices in large components of certain sizes).
Set $p = p_c + \varepsilon/\Omega$ and let $V^{-1/3}(\log V)^{1/3} \ll \varepsilon \ll 1$. Then there exists $\varepsilon_0$ satisfying $\varepsilon_0 \ll \varepsilon$ such that,
for every $\delta > 0$,

$$\mathbb{P}_p\Big(\big|Z_{\ge \varepsilon_0^{-2}} - \mathbb{E}_p[Z_{\ge \varepsilon_0^{-2}}]\big| \ge \delta\varepsilon V\Big) = o(1). \tag{2.5}$$

The proof of Proposition 2.2 makes use of a somewhat delicate second moment argument. We
start by upper bounding the variance of Z>jv and Z>2n — for specific values of A^. These 
bounds are then combined to prove that $Z_{\ge \varepsilon_0^{-2}}$ is concentrated around its mean, provided that
$V^{-1/3}(\log V)^{1/3} \ll \varepsilon \ll 1$. The proof can be found in Section 5.2, where we also show that a
possible choice for $\varepsilon_0$ is $\varepsilon_0 = V^{-1/3}$, which indeed satisfies $\varepsilon_0^{-2} = V^{2/3} \gg \varepsilon^{-2}$, since $\varepsilon \gg V^{-1/3}$.
For the remainder of this section, we only need to know that a suitable $\varepsilon_0$ does indeed exist; its
precise value is irrelevant.

Armed with Propositions 2.1 and 2.2, we now prove the upper bound on $|C_{\max}|$ in Theorem 1.1.

Proof of the upper bound part of Theorem 1.1. We shall show that whp, $|C_{\max}| \le 2\varepsilon V(1 + o(1))$.
Choose $\varepsilon_0$ as in Proposition 2.2. By Proposition 2.2, the random variable $Z_{\ge \varepsilon_0^{-2}}$ is concentrated
around $2\varepsilon V$. In other words, the number of vertices in connected components of size at least $\varepsilon_0^{-2}$
is close to $2\varepsilon V$ whp. However, on the event $\{Z_{\ge \varepsilon_0^{-2}} \ge 1\}$, we have that

$$|C_{\max}| \le Z_{\ge \varepsilon_0^{-2}}, \tag{2.6}$$

and so it follows that

$$|C_{\max}| \le 2\varepsilon V\,(1 + o_p(1)). \tag{2.7}$$

■

The following result is an easy corollary to Proposition 2.2. It shows that, in fact, concentration
of measure holds for the number of vertices in clusters of size at least $\eta V$, for all $\eta \ll \varepsilon$ such that
$\eta^3 V \gg 1$. This will be required for the proof of the lower bound of Theorem 1.1 as discussed in
the proof overview above.

Corollary 2.3 (Concentration of the number of vertices in all large components). Set $p = p_c + \varepsilon/\Omega$
and let $V^{-1/3}(\log V)^{1/3} \ll \varepsilon \ll 1$. Let $\eta = \eta(n)$ satisfy $\eta \ll \varepsilon$ and $\eta^3 V \gg 1$. Then, for every
$\delta > 0$,

$$\mathbb{P}_p\big(|Z_{\ge \eta V} - \mathbb{E}_p[Z_{\ge \eta V}]| \ge \delta\varepsilon V\big) = o(1). \tag{2.8}$$

Corollary 2.3 allows us to use the concentration of $Z_{\ge \eta V}$ for any appropriate $\eta$, thus effectively
removing the delicate choice of $\eta$ in Proposition 2.2. We shall see that Corollary 2.3 follows from
Propositions 2.1 and 2.2 combined with a simple first moment estimate.

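Proof. Let us choose $\eta$ satisfying both $\eta \ll \varepsilon$ and $\eta^3 V \gg 1$, so that in particular $\eta V \gg \eta^{-2} \gg \varepsilon^{-2}$.
Choose further $\varepsilon_0$ as given in Proposition 2.2. We shall assume that $\varepsilon_0^{-2} \le \eta V$; the proof when
$\varepsilon_0^{-2} > \eta V$ is a simple adaptation of the argument below. By Proposition 2.1, for any fixed $\delta > 0$,

$$\mathbb{P}_p\big(|Z_{\ge \eta V} - 2\varepsilon V| \ge \delta\varepsilon V\big) \le \mathbb{P}_p\Big(\big|Z_{\ge \varepsilon_0^{-2}} - \mathbb{E}_p[Z_{\ge \varepsilon_0^{-2}}]\big| \ge \delta\varepsilon V/3\Big) + \mathbb{P}_p\Big(\big|Z_{\ge \eta V} - Z_{\ge \varepsilon_0^{-2}}\big| \ge \delta\varepsilon V/3\Big), \tag{2.9}$$
provided that $V$ is large enough. More precisely, the volume $V$ must be such that

$$\big|\mathbb{E}_p[Z_{\ge \varepsilon_0^{-2}}] - 2\varepsilon V\big| \le \delta\varepsilon V/3, \tag{2.10}$$

or, equivalently (using transitivity),

$$\big|\mathbb{P}_p(|C(v)| \ge \varepsilon_0^{-2}) - 2\varepsilon\big| \le \delta\varepsilon/3. \tag{2.11}$$

By Proposition 2.2,

$$\mathbb{P}_p\Big(\big|Z_{\ge \varepsilon_0^{-2}} - \mathbb{E}_p[Z_{\ge \varepsilon_0^{-2}}]\big| \ge \delta\varepsilon V/3\Big) = o(1). \tag{2.12}$$

Further, since $\varepsilon_0^{-2} \le \eta V$, we have $Z_{\ge \eta V} \le Z_{\ge \varepsilon_0^{-2}}$, and the Markov inequality yields that,
for every $\delta > 0$,

$$\mathbb{P}_p\Big(\big|Z_{\ge \eta V} - Z_{\ge \varepsilon_0^{-2}}\big| \ge \delta\varepsilon V/3\Big) \le \frac{3\,\mathbb{E}_p\big[Z_{\ge \varepsilon_0^{-2}} - Z_{\ge \eta V}\big]}{\delta\varepsilon V} = \frac{3}{\delta\varepsilon}\Big[\mathbb{P}_p(|C(v)| \ge \varepsilon_0^{-2}) - \mathbb{P}_p(|C(v)| \ge \eta V)\Big] = o(1), \tag{2.13}$$

where the last equality follows from Proposition 2.1 together with the fact that $\varepsilon^{-2} \ll \eta V \ll \varepsilon V$
and $\varepsilon^{-2} \ll \varepsilon_0^{-2} \ll \varepsilon V$. Equation (2.13) thus completes the proof. □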

It remains to establish a matching lower bound for $|C_{\max}|$, and we shall do this via a "sprinkling"
argument. (Sprinkling is sometimes referred to as the "two-round exposure", see [18, Chapter 1].)
This part of our proof is based on two results below. Before we state them, we need to introduce
some more notation.

For each $i$, the $i$-th horizontal line of $H(2,n)$ is defined to be the set $\{(i,x) : x = 1, \ldots, n\}$ of
vertices with first coordinate $i$; similarly the set $\{(x,i) : x = 1, \ldots, n\}$ of vertices with the second
coordinate equal to $i$ constitutes the $i$-th vertical line. A vertex belonging to a given line is said
to be an element of that line.

Proposition 2.4 (Lower bound on the number of line elements in a large cluster). Set $p = p_c + \varepsilon/\Omega$.
There exists a constant $C > 0$ such that the following holds. Fix $\varepsilon, \eta$ satisfying $V^{-1/3} \ll \varepsilon \ll 1$,
$\eta \ll \varepsilon$, $\eta V \gg \varepsilon^{-2}$ and $\eta V/n \ge C \log n$ for $n$ sufficiently large. Then whp for every cluster of size
at least $\eta V$, there are at least $3n/4$ horizontal lines each with at least $\eta V/(4n)$ elements contained in
the cluster.

The proof of Proposition 2.4 is deferred to Section 4.3. Assuming it holds, we now prove that
the second round exposure will join together every pair of large clusters formed during the first
round. In the following lemma, for a pair of sets of vertices $S_1, S_2$, we use the notation $S_1 \leftrightarrow S_2$
to denote the event that $S_1, S_2$ are joined together. We also write $S_1 \nleftrightarrow S_2$ to denote that $S_1, S_2$
are not joined together.

Lemma 2.5 (Sprinkling). Set $p = p_c + \varepsilon/\Omega$. Choose $V^{-1/3} \ll \varepsilon \ll 1$. Let $\eta = \sqrt{\varepsilon}\, V^{-1/6}$, and
let $S_1, S_2$ be disjoint sets of vertices both containing at least $\eta V/(4n)$ elements of at least $3n/4$
horizontal lines (possibly different lines for $S_1$ and $S_2$). Then

$$\mathbb{P}_{\eta/\Omega}(S_1 \leftrightarrow S_2) \ge 1 - o\Big(\frac{V^{-1/3}}{\varepsilon}\Big). \tag{2.14}$$



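Proof. Choose two disjoint vertex sets $S_1$ and $S_2$ each containing at least $\eta V/(4n)$ elements in at least
$3n/4$ horizontal lines. Then $S_1$ and $S_2$ must have at least $n/2$ such lines in common, that is both
$S_1$ and $S_2$ contain at least $\eta V/(4n) = \eta n/4$ elements of these lines. Note that, since $\varepsilon \gg V^{-1/3}$,
$\eta = \sqrt{\varepsilon}\, V^{-1/6}$ and $V = n^2$, we have $\eta V/n = \sqrt{\varepsilon}\, V^{1/3} \gg 1$. Along each shared good line, there are
at least $(\eta n)^2/16$ edges with one endpoint in $S_1$ and the other in $S_2$, so over the $n/2$ shared lines
there are at least $\eta^2 n^3/32$ such edges. All of these edges will be
occupied independently under $\mathbb{P}_{\eta/\Omega}$, so

$$\mathbb{P}_{\eta/\Omega}(S_1 \nleftrightarrow S_2) \le \Big(1 - \frac{\eta}{\Omega}\Big)^{\eta^2 n^3/32}. \tag{2.15}$$
Using the inequality $1 - x \le e^{-x}$, the bound $\Omega \le 2n$ and the fact that $\varepsilon \gg V^{-1/3}$,

$$\mathbb{P}_{\eta/\Omega}(S_1 \nleftrightarrow S_2) \le e^{-\eta^3 n^2/64} = e^{-\varepsilon^{3/2} V^{1/2}/64} = o\big(\varepsilon^{-1} V^{-1/3}\big), \tag{2.16}$$
which completes the proof. □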
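
The quantities in this proof can be evaluated numerically. The sketch below is an illustration under explicit assumptions: it takes the edge count to be $(\eta n/4)^2$ on each of $n/2$ shared lines (one reading of the counting argument above), uses $\Omega = 2(n-1)$, and introduces the hypothetical helper `sprinkling_quantities`. It returns the resulting bound on the probability that two clusters are not joined, together with the target order $\varepsilon^{-1}V^{-1/3}$.

```python
import math


def sprinkling_quantities(eps, n):
    """Return (eta, edge count, bound on P(not joined), target eps^-1 V^-1/3) for H(2, n)."""
    V = n * n
    eta = math.sqrt(eps) * V ** (-1.0 / 6.0)
    omega = 2 * (n - 1)
    shared_lines = 3 * n / 4 + 3 * n / 4 - n       # pigeonhole: at least n/2 common lines
    edges = shared_lines * (eta * n / 4) ** 2      # assumed count of S1-S2 edges on shared lines
    log_fail = edges * math.log1p(-eta / omega)    # log of (1 - eta/Omega)^edges
    target = 1.0 / (eps * V ** (1.0 / 3.0))
    return eta, edges, math.exp(log_fail), target


if __name__ == "__main__":
    for eps, n in ((0.05, 10 ** 4), (0.1, 10 ** 5)):
        eta, edges, fail, target = sprinkling_quantities(eps, n)
        x = math.sqrt(eps ** 3 * n * n)            # (eps^3 V)^(1/2), the natural scale
        print(eps, n, x, fail, target)
```

With these constants the bound only drops below the target once $\varepsilon^3 V$ is quite large, which is consistent with the statement being asymptotic in $\varepsilon^3 V \to \infty$.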

We can now do the lower bound part of Theorem 1.1.

Proof of the lower bound part of Theorem 1.1. We choose $\eta = \sqrt{\varepsilon}\, V^{-1/6}$, as in Lemma 2.5, and
note that the results in Proposition 2.4 apply to this choice of $\eta$. Indeed, since $\varepsilon \gg V^{-1/3}$, we
have that $\eta = \varepsilon/\sqrt{\varepsilon V^{1/3}} \ll \varepsilon$ and $\eta V = \sqrt{\varepsilon}\, V^{5/6} \gg \varepsilon^{-2}$. Finally, once again using $\varepsilon \gg V^{-1/3}$, we
have $\eta V/n = \sqrt{\varepsilon}\, V^{1/3} \gg V^{1/6} \ge C \log n$ for $n$ sufficiently large.

We define $p_-$ by the relation

$$p_- + (1 - p_-)\frac{\eta}{\Omega} = p. \tag{2.17}$$

Note that every configuration with edge probability $p$ can be obtained in a unique way as follows.
First construct a configuration by throwing in edges independently of one another with probability
$p_-$; subsequently, "sprinkle" extra edges with probability $\eta/\Omega$, independently of one another and of
the $p_-$ configuration. In the final configuration, an edge is occupied precisely when it is occupied
in either the $p_-$ configuration, or when it is an edge that is added during the sprinkling procedure.
Since $\eta \ll \varepsilon$,

$$p_- = p + o\Big(\frac{\varepsilon}{\Omega}\Big). \tag{2.18}$$


Let $Z^-_{\ge \eta V}$ denote the number of vertices in connected components of size at least $\eta V$ in the $p_-$
configuration. Since $\delta$ in Corollary 2.3 is arbitrary, it implies that $Z^-_{\ge \eta V} = 2\varepsilon V(1 + o_p(1))$ after
the first round of exposure; and by Proposition 2.4 whp every cluster of size at least $\eta V$ includes
at least $\eta V/(4n)$ elements in at least $3n/4$ lines. Thus, under the measure $\mathbb{P}_{p_-}$, whp there are at
most $\frac{2\varepsilon}{\eta}(1 + o(1))$ connected clusters of size at least $\eta V$, and each of these connected components
contains at least $\eta V/(4n)$ elements in at least $3n/4$ lines.

It now suffices to prove that whp the subsequent sprinkling procedure (second round of
exposure) joins together every pair of clusters of size at least $\eta V$. Indeed, if this is the case, then after
the sprinkling we end up with a single connected component of size at least

$$Z^-_{\ge \eta V} \ge 2\varepsilon V\,(1 + o_p(1)). \tag{2.19}$$

Let $v_1$ and $v_2$ be two vertices such that $C(v_1) \ne C(v_2)$, and $|C(v_1)| \ge \eta V$ and $|C(v_2)| \ge \eta V$.
Let us take $S_1 = C(v_1)$ and $S_2 = C(v_2)$. By Proposition 2.4, we may assume that for both $S_1$ and
$S_2$ one can find at least $3n/4$ lines (not necessarily the same ones for $S_1$ and $S_2$) each with at least
$\eta V/(4n)$ elements in $S_1$ and $S_2$. Then, by Lemma 2.5,

$$\mathbb{P}_{\eta/\Omega}(S_1 \nleftrightarrow S_2) = o\Big(\frac{V^{-1/3}}{\varepsilon}\Big). \tag{2.20}$$

But whp there are at most $\big(\frac{2\varepsilon}{\eta}\big)^2 (1 + o(1)) = O(\varepsilon V^{1/3})$ distinct choices for $C(v_1)$ and $C(v_2)$ with
$|C(v_1)| \ge \eta V$ and $|C(v_2)| \ge \eta V$, and so a simple union bound implies that after sprinkling whp
all connected components of size at least $\eta V$ are connected. By (2.19), this completes the proof. □

The remainder of the paper is organised as follows. In Section 3 we prove auxiliary results
relating to the tails of the total progeny of binomial Galton-Watson processes. Section 4 contains
proofs of Propositions 2.1 and 2.4; therein we investigate the structure of percolation clusters
(cluster tails and the number of elements per coordinate line) by comparing them to binomial
Galton-Watson processes. Finally, in Section 5, we establish concentration of measure for the
number of vertices in large clusters, thus proving Proposition 2.2.

3. Total progeny of a Galton- Watson process 

This section brings together some useful results from the theory of branching processes, which 
will play a key role at various stages in our proofs. 

We consider a standard Galton-Watson process whose offspring distribution $Z$ is a binomial
$\mathrm{Bi}(N, p)$, where $N \in \mathbb{N}$ and $p \in [0, 1]$ is the Hamming graph edge probability. We assume that
with probability 1 the process begins with one individual. We write $\mathbb{P}_{N,p}$ for the probability measure
corresponding to this process (implicitly assuming an underlying sample space and $\sigma$-field).

Let $F$ be the total progeny or family size. Our aim is to prove the following three results
concerning the distribution of $F$. Proposition 3.1 compares the distribution of $F$ under the measures
$\mathbb{P}_{N,p}$ for different values of $\varepsilon$ and $N$. Proposition 3.2 estimates the probability that the value
of $F$ is between $\ell$ and $2\ell$, for some (large) integer $\ell$. Proposition 3.3 estimates the probability that
$F$ takes a value at least $\ell$ for some large integer $\ell$.

Proposition 3.1 (Tails of total progeny in two binomial branching processes). Let ^ G N. Suppose 
that N E N and N = N{N) satisfy N > N. Further, assume that e = Np — 1 is such that e — >• 
and e > N~'^^'^ ; and that e = Np — 1 > and \e — e\ = o{e) as N ^ oo. Then, for some constant 
C > 0, as N ^ oo, 

I^nAF >i)- P^,,(F > ^)l < C(k - ^1 + ^ + (3.1) 

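Proposition 3.2 (Bounds on the total progeny distribution). Let $N, \ell \in \mathbb{N}$. Suppose that $\varepsilon =
Np - 1$ satisfies $\varepsilon \to 0$ and $\varepsilon \ge N^{-2/3}$ as $N \to \infty$. Then, for some constant $C > 0$, as $N \to \infty$,

$$\mathbb{P}_{N,p}(F \in [\ell, 2\ell]) \le \frac{C}{\ell^{1/2}}. \tag{3.2}$$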

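Proposition 3.3 (Tails of the total progeny near criticality). Let $N, \ell \in \mathbb{N}$. Suppose that $\varepsilon =
Np - 1$ satisfies $\varepsilon \to 0$ and $\varepsilon \ge N^{-2/3}$ as $N \to \infty$. Then, as $N \to \infty$,

$$\mathbb{P}_{N,p}(F \ge \ell) = 2\varepsilon + O(\varepsilon^2) + O(\ell^{-1/2}). \tag{3.3}$$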

We note that with a little care and minor modifications, Propositions 3.1-3.3 could be extended
to the case where $\varepsilon$ is a positive constant (i.e. strictly above the critical window). In the
corresponding statement, (3.3) would have $\zeta_{1+\varepsilon}$ instead of $2\varepsilon$, where, as before, $\zeta_\lambda$ denotes the survival
probability of a Poisson Galton-Watson process with mean family size $\lambda$.

Our proofs of Propositions 3.1 and 3.2 will make use of the well-known Otter-Dwass formula,
which describes the distribution of the total progeny of a branching process, see [H], [23]. We begin
by stating a special case of this formula (due to Otter) for a branching process starting with 1
individual. (The formula was later extended by Dwass to a process starting with $r$ individuals, for
arbitrary $r \in \mathbb{N}$, but we do not make use of the extension here.)

Lemma 3.4 (Otter-Dwass formula). Let $Z_1, Z_2, Z_3, \dots$ be i.i.d. random variables distributed as $Z$.
Let $\mathbb{P}$ denote the Galton-Watson process measure. For all $k \in \mathbb{N}$,

$$\mathbb{P}(F = k) = \frac{1}{k}\,\mathbb{P}\Big(\sum_{i=1}^{k} Z_i = k - 1\Big).$$
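As a hedged numerical illustration of Lemma 3.4 (not part of the original argument), the following minimal sketch evaluates the Otter-Dwass formula for a binomial offspring law, using the fact that a sum of $k$ i.i.d. $\mathrm{Bi}(N,p)$ variables is $\mathrm{Bi}(kN,p)$. The function names and the parameter values $N = 10$, $p = 0.05$ are assumptions chosen for illustration; the sketch assumes $0 < p < 1$.

```python
import math

def log_binom_pmf(n: int, p: float, j: int) -> float:
    """Return log P(Bi(n, p) = j), using log-gamma to avoid overflow (needs 0 < p < 1)."""
    if j < 0 or j > n:
        return float("-inf")
    return (math.lgamma(n + 1) - math.lgamma(j + 1) - math.lgamma(n - j + 1)
            + j * math.log(p) + (n - j) * math.log1p(-p))

def total_progeny_pmf(N: int, p: float, k: int) -> float:
    """P(F = k) for a Galton-Watson process with Bi(N, p) offspring and one ancestor.

    Otter-Dwass: P(F = k) = (1/k) * P(Z_1 + ... + Z_k = k - 1),
    and the sum of k i.i.d. Bi(N, p) variables is Bi(k * N, p).
    """
    return math.exp(log_binom_pmf(k * N, p, k - 1)) / k

# Illustrative (hypothetical) subcritical parameters: mean offspring N * p = 0.5.
N, p = 10, 0.05
pmf = [total_progeny_pmf(N, p, k) for k in range(1, 400)]
total_mass = sum(pmf)                                       # close to 1 (a.s. extinction)
mean_progeny = sum(k * q for k, q in enumerate(pmf, start=1))  # close to 1 / (1 - N * p)
```

In the subcritical case the probabilities sum to 1 and the mean total progeny equals $1/(1 - Np)$, which gives a quick sanity check of the formula.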
We now prove each of Propositions 3.1-3.3 in turn.

Proof of Proposition \3.1\ It will be convenient for us to introduce new parameters A = pN and 
A = pN . By assumption A, A > 1, so that both branching processes are supercritical. The total 
progeny size in the Bi(A^,p) process will be denoted by F. 
By Lemma 3.4, for each $k \in \mathbb{N}$,

$$\mathbb{P}_{N,p}(F = k) = \frac{1}{k}\,\mathbb{P}\Big(\sum_{i=1}^{k} Z_i = k - 1\Big), \tag{3.4}$$

where the $Z_i$ are i.i.d. $\mathrm{Bi}(N, \lambda/N)$. We now investigate the asymptotics of the formula (3.4) as
$N \to \infty$, for large integers $k = k(N)$. Our aim is to obtain estimates for $\mathbb{P}_{N,p}(F = k)$ sharp enough
for the errors to be summable.

In outline, our calculation is as follows. First we demonstrate that, if /c > Cq^"^ log(l/£:) for a 
sufficiently large constant Co, then 

^N,pik <F<oo)< k-\ (3.5) 

Next we show that (13. 5p implies an identical upper bound for P^p(F = k). Subsequently, we prove 
that there exists a constant C > such that, if /c < Coe'"^ log(l/£:), then 



'(^ = k)- ^nA^ = k)\ < ^exp ( - ^^^) |(1 + ke)\e-e\ + ^^+j^ + ^[ (3.6) 



C { {k- l)e' 
-N,pyr = K) - ^yv,pl^ = ^ P72 I r 

We can then sum the errors in (13.61) and (13. 5p to show that, for any £ G N, 

I^nA^ <F<oo)- P^,p(£ <F<oc)\<c[\e-e\ + ^ + i) . (3.7) 

Finally, since A > 1, we need to estimate F^A-^ ~ ^) ^^'^ pi-^ ~ shall show that 

\Fn,p{F = oo)- P^^p(F = oo)\ <C\6- e\. (3.8) 

Combining the last two estimates yields Proposition 3.1. As we shall see below, several steps of
our proof will also play a role in proving Propositions 3.2 and 3.3.

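Let us make a start on the details. To show (3.5), we note that (3.4) implies

$$\mathbb{P}_{N,p}(F = k) \le \frac{1}{k}\,\mathbb{P}\Big(\sum_{i=1}^{k} Z_i \le k\Big), \tag{3.9}$$

where $\sum_{i=1}^{k} Z_i \sim \mathrm{Bi}(kN, p)$. But if $Z \sim \mathrm{Bi}(n, p)$, then (see for instance [16])

$$\mathbb{P}(Z \le np - t) \le e^{-t^2/(2np)}. \tag{3.10}$$

Applying (3.10) with $n = kN$, $p = \lambda/N$ and $t = np - k = k\varepsilon$ (and also using our assumption that
$\varepsilon < 1$), we obtain that

$$\mathbb{P}_{N,p}(F = k) \le \frac{1}{k}\, e^{-\frac{k\varepsilon^2}{2(1+\varepsilon)}} \le \frac{1}{k}\, e^{-\frac{k\varepsilon^2}{4}}. \tag{3.11}$$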
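The binomial lower-tail estimate used here can be checked numerically. The sketch below is an illustration only: it assumes the standard Chernoff form $\mathbb{P}(Z \le np - t) \le \exp(-t^2/(2np))$ for (3.10), and the parameter values and function names are hypothetical. It compares the exact lower tail with the bound and confirms that the substitution $n = kN$, $p = \lambda/N$, $t = k\varepsilon$ produces the exponent $k\varepsilon^2/(2(1+\varepsilon))$.

```python
import math

def binom_cdf(n: int, p: float, m: int) -> float:
    """Exact P(Bi(n, p) <= m) by direct summation (empty sum, hence 0, if m < 0)."""
    return sum(math.comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(0, m + 1))

def lower_tail_bound(n: int, p: float, t: float) -> float:
    """Chernoff-type bound exp(-t^2 / (2 n p)) on P(Bi(n, p) <= n p - t)."""
    return math.exp(-t * t / (2 * n * p))

# Hypothetical parameters for a direct comparison of exact tail and bound.
n, p = 200, 0.3
checks = []
for t in (5, 10, 20, 30):
    m = math.floor(n * p - t)            # Z <= np - t  iff  Z <= floor(np - t)
    checks.append((t, binom_cdf(n, p, m), lower_tail_bound(n, p, t)))

# The substitution used in the text: n = k*N, p = lam/N, t = k*eps.
k, N, eps = 50, 1000, 0.1
lam = 1 + eps
substituted = lower_tail_bound(k * N, lam / N, k * eps)
closed_form = math.exp(-k * eps**2 / (2 * (1 + eps)))
```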



But if > CqE ^ log(l/£:) for a sufficiently large Co > then 



ie"^ < k-\ (3.12) 



SO (13. 5p follows. 
We now show that (3.6) holds for $k \le C_0 \varepsilon^{-2} \log(1/\varepsilon)$. Clearly, (3.4) implies that, for each $k \in \mathbb{N}$,

(\ k-l / \ kN-k+l 

By Stirling's formula, 

-,\/ -,\ r I ' I 

('m)r := m[m 



(T r ( r 

~ — 

Applying the above approximation with m = kN and r = k — 1 (and noting that ^ = 0{^)), 
we arrive at 

^ l(kX)''-^ ( {k-\f {k-\f ^(k\\(_^ x.kN-k+i 

^"-(^ = = *I*3T^ ^''K - Hwf - W + Kiv^)) - iv) 



k\ "^K 2N N 2kN QN^ \m mJJ\ N 
Observe that 

XskN-k+l X 



(l - -) = exp {{kN -k + l) log(l - -)) 



/ , , kX X X'^k X^k X^k ^, \ k . 
= exp(-AA: + ---- — + ^-^ + 0(- + - 

so that 

^fc-ig-(fc-i)g-A k[X-\f 1-A ] 



P^,p(F = k) = exp((A: - l)/(A))exp ( 



27V N 2kN 

x-p(]^^(^)+0(^ + _!)), (3.14) 

where 

/(A) = log A - (A - 1), g{X) = + y - y • (3-15) 
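The Taylor expansion for $|\lambda - 1|$ small gives

$$f(\lambda) = -\frac{(\lambda - 1)^2}{2} + O(|\lambda - 1|^3). \tag{3.16}$$

But $\lambda - 1 = \varepsilon$ and $k \le C_0 \varepsilon^{-2} \log(1/\varepsilon)$, where $N^{-2/3} \le \varepsilon = o(1)$, and so $k|\lambda - 1|^3 = o(1)$,
uniformly for all such $k$. The other error terms can be bounded similarly, and hence, uniformly
for $k \le C_0 \varepsilon^{-2} \log(1/\varepsilon)$,

$$\mathbb{P}_{N,p}(F = k) = (1 + o(1)) \frac{1}{\sqrt{2\pi}\, k^{3/2}} \exp\Big(-\frac{1}{2}(k - 1)(\lambda - 1)^2\Big). \tag{3.17}$$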
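To make the asymptotic formula (3.17) concrete, the following sketch compares the exact Otter-Dwass probability with the approximation $\frac{1}{\sqrt{2\pi}k^{3/2}}\exp(-\frac12 (k-1)\varepsilon^2)$. It is an illustration under assumed parameter values ($N = 10^6$, $\varepsilon = 0.01$) and hypothetical function names. The $(1+o(1))$ factor absorbs terms such as $e^{-\varepsilon}$, so the ratios are expected to be close to, but not exactly, 1.

```python
import math

def log_binom_pmf(n: int, p: float, j: int) -> float:
    """Return log P(Bi(n, p) = j) via log-gamma (needs 0 < p < 1)."""
    if j < 0 or j > n:
        return float("-inf")
    return (math.lgamma(n + 1) - math.lgamma(j + 1) - math.lgamma(n - j + 1)
            + j * math.log(p) + (n - j) * math.log1p(-p))

def exact_pmf(N: int, p: float, k: int) -> float:
    """Exact P(F = k) from the Otter-Dwass formula with Bi(N, p) offspring."""
    return math.exp(log_binom_pmf(k * N, p, k - 1)) / k

def asymptotic_pmf(k: int, eps: float) -> float:
    """Leading-order approximation (2*pi)^(-1/2) k^(-3/2) exp(-(k-1) eps^2 / 2)."""
    return math.exp(-0.5 * (k - 1) * eps**2) / (math.sqrt(2 * math.pi) * k**1.5)

# Hypothetical slightly supercritical parameters: N * p = 1 + eps.
N, eps = 10**6, 0.01
p = (1 + eps) / N
ratios = {k: exact_pmf(N, p, k) / asymptotic_pmf(k, eps) for k in (100, 1000, 10000)}
```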

We now compare Pjv,p(i^ = k) to P^^p(F = k) for k < CqE^^ log(l/e). Write X/N = X/N, where 
A = NX/N < X. Then a calculation similar to the one above shows that 

^fe-lg-A-(fc-l) 



^nAp = ^) = y, {(^ - 



k\ 

( k{X-\f l-A 1 k k 1 

X exp ^ — H ^ ^ + -^fi'(A) + O + 

V 2N N 2kN m Vats a^2 
By assumption, $\varepsilon = \lambda - 1 > 0$ and $\tilde\varepsilon = \tilde\lambda - 1 > 0$. Further,

f{X)-f{X) = {e-e)f{l + s), (3.18) 

for some s G {B,e), where 

|/'(l + s)| = ^<s, s>0, (3.19) 

and so 

f(X)-fCX)=0{e\e-e\). (3.20) 
We deduce that, for e > N'^'^ and all k < CQe~'^\og{l/ e), 

|P^,,(F = k)- P^^p(F = k)\ = ^^^^j exp{{k - 1)/(A))| exp(x) - exp(y)| (3.21) 

where 

ke"^ k ^( k e I \ 

^ = -m^mf^''''\lP^N^kNh 

y =(£-£)- + -^g(X) + 0{ke\e - e\) + o{-^ + t + 

^ ' 2N m \m N kNJ 

Since k < Coe~^ log(l/e), N^"^/^ <^ e = o(l) and e = o{e), x = o(l) and also all contributions to 
y are o(l), except possibly for the term ke\e — e\. Now, for some constant C, 

I exp(x) - exp{y)\ < C\x - y|el^l^l^l, (3.22) 

where, for u,v E M, u\/ v = max{n, v}. Note that x = o(l) and, since |£ — = o{e), we have that 
y = o(l) + o{ks'^). As a result, we obtain that, for sufficiently large, 

I exp(a;) - exp{y)\ < C\x - y\e^''-'^''^\ (3.23) 

Since further — = 0{N~'^\e — e\), the contribution to \x — y\ due to the term g{X)k/N'^ — 
g{X)k/N'^ is 0{kN~'^\6 — e\) = o{ke\e — e\), which gives 

\x-y\<Cil + ke) k - ^1 + O + ^ + ^) . (3.24) 

Hence, combining (ICTD . ([323]), fl336D . for all k < Coe~'^\og{l/ e^), we arrive at ([31]). 
Summing the estimates (13.51) and (13. 6p over k > i, 

|F^,p(£ < F < oo) - P^p(£ < F < oo)| 

,oe((ih-mi.-.-ka.^.^) -"-':-^-^^^' .e^-' 

k>£ k>i 

The final contribution is 0{i~^); for the remaining terms observe that 

'Cf-'' for a > 1, 



^^^,-.exp(-(.-l)eV4)<|^^__ j^^,^^^; (3.25) 
which yields (with a suitably adjusted value of C) 

1 e 1 



I^nA^ < F < oo) - P^,^(£ < F < oo)| < - 5| + — J + ^ + 



. 1 1\ , , 
To prove (3.8), we need to estimate $|a - \tilde a|$, where $a = \mathbb{P}_{N,p}(F < \infty)$ and $\tilde a$ is the corresponding
extinction probability of the second branching process.
The quantities $a$, $\tilde a$ respectively are the smallest positive roots of the equations

$$a = \Big(1 + \frac{\lambda(a - 1)}{N}\Big)^{N} \quad\text{and}\quad \tilde a = \Big(1 + \frac{\tilde\lambda(\tilde a - 1)}{\tilde N}\Big)^{\tilde N}. \tag{3.27}$$

Using the convexity of probability generating functions and the supercriticality of the branching
processes in question, the equations in (3.27) each have precisely one root $a$ and $\tilde a$ respectively in
the interval $[0, 1)$.

The proof is divided into two main steps. In the first step, we prove that $1 - a = 2\varepsilon + O(\varepsilon^2)$,
which also implies that $|a - \tilde a| = o(\varepsilon)$ when $|\varepsilon - \tilde\varepsilon| = o(\varepsilon)$, so that we may use a Taylor expansion.
In the second main step, we prove that $|a - \tilde a| \le C|\varepsilon - \tilde\varepsilon|$.

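To prove that $1 - a = 2\varepsilon + O(\varepsilon^2)$, we expand the right hand side of (3.27) to obtain

$$a - 1 = \lambda(a - 1) + \binom{N}{2}\Big(\frac{\lambda}{N}\Big)^2 (a - 1)^2 + O(|a - 1|^3).$$

Dividing by $a - 1$ and substituting $\lambda = 1 + \varepsilon$ gives

$$1 = \lambda + \frac{N-1}{2N}\lambda^2 (a - 1) + O(|a - 1|^2) = 1 + \varepsilon + \frac{N-1}{2N}(1 + \varepsilon)^2 (a - 1) + O(|a - 1|^2),$$

so that

$$(1 - a)\big(1 + 2\varepsilon + \varepsilon^2 - N^{-1} - 2N^{-1}\varepsilon + O(N^{-1}\varepsilon^2)\big) = 2\varepsilon + O(|1 - a|^2),$$

and so, again using that $\varepsilon \ge N^{-2/3}$,

$$1 - a = 2\varepsilon + O(\varepsilon^2). \tag{3.28}$$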
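The estimate (3.28) is easy to probe numerically. The sketch below is a minimal illustration, with hypothetical parameter values and function name, that finds the smallest root of the fixed-point equation in (3.27) by iterating the generating function from 0 (the iterates increase monotonically to the extinction probability) and then compares the survival probability $1 - a$ with $2\varepsilon$.

```python
import math

def extinction_probability(N: int, lam: float,
                           tol: float = 1e-15, max_iter: int = 1_000_000) -> float:
    """Smallest root in [0, 1] of a = (1 + lam * (a - 1) / N) ** N.

    Iterating the offspring generating function from a = 0 gives a
    non-decreasing sequence converging to the extinction probability.
    """
    a = 0.0
    for _ in range(max_iter):
        # exp(N * log1p(x)) is a numerically stable way to compute (1 + x) ** N.
        a_next = math.exp(N * math.log1p(lam * (a - 1.0) / N))
        if abs(a_next - a) < tol:
            return a_next
        a = a_next
    return a

# Hypothetical near-critical parameters: lam = 1 + eps with small eps.
N, eps = 10**6, 0.01
a = extinction_probability(N, 1 + eps)
survival = 1.0 - a                      # expected to be 2 * eps + O(eps ** 2)
residual = abs(a - math.exp(N * math.log1p((1 + eps) * (a - 1.0) / N)))
```

Convergence is slow near criticality because the derivative of the generating function at the root is roughly $1 - \varepsilon$, which is why the iteration cap is generous.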
To prove that |a — a| < C\e — e\, we use that 

a-d = (1 + A(a-1))^-(1 + A(a-l))^ = /^(a)-/^(a) + /^(a)((l + A(a-l))^-^-l), (3.29) 
where f^{x) = {l + j^{x-l))'^. Note first that 

(1 + A(a _ i))N-N _ 1 ^ 0(^^(1 - a)). (3.30) 
Further, since \a — d\ = o{\a — 1\) , we have that ff;f{a) < 2. Also, 



(3.32, 

and it is not hard to see that /^(x) = 0(1) uniformly for x G [a, a]. Hence 

a - a = (a - 5.)f'^{d.) + 0(^ ^~^ (1 - a)) + 0((a - df), (3.33) 

so that 

- ^/ (iV-iV)(l-a) ^ I- (a -if ^ 
A closer inspection of f'^{x) yields that f'^{a) — 1 = e + 0(5), so that



a-~a\<0[ '^-^h + O f <C\e-e\. (3.35) 



N 

This completes the proof of Proposition 3.1. □

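Proof of Proposition 3.2. By (3.17), for all $k \le C_0 \varepsilon^{-2} \log(1/\varepsilon)$,

$$\mathbb{P}_{N,p}(F = k) = (1 + o(1)) \frac{1}{\sqrt{2\pi}\, k^{3/2}} \exp\Big(-\frac{1}{2}(k - 1)\varepsilon^2\Big).$$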

Also (provided Cq is large enough) for k > Co^^^ log(l/e), 

Pjv^p(F = A;) < k-^. 

Summing over i < k < 2£ we obtain 

^nA^ <F<2£)<C (tL^~'"'^' + ^"') < ^ 



VA;3/2 7 -£1/2' 

l<k<2£ 

where the constant C was adjusted within the final inequality. I 

Proof of Proposition 3.3. We have

$$\mathbb{P}_{N,p}(F \ge \ell) = 1 - \mathbb{P}_{N,p}(F < \infty) + \mathbb{P}_{N,p}(\ell \le F < \infty). \tag{3.36}$$

By (3.28), the term $1 - \mathbb{P}_{N,p}(F < \infty)$ is $2\varepsilon + O(\varepsilon^2)$. Calculations similar to those in the proof of
Proposition 3.2 show that the final term is bounded by $O(\ell^{-1/2})$, which completes the proof. □



4. Comparisons to branching processes

In this section, we use comparisons to branching processes and concentration of measure
techniques to study the cluster tail probabilities (cf. Proposition 2.1), as well as the cluster structure,
specifically, the number of vertices per line, of large clusters (cf. Proposition 2.4).

This section is organised as follows. In Section 4.1 we describe a cluster exploration procedure,
state key estimates for the tails of the cluster size distribution, and prove the upper bound part of
Proposition 2.1. In Section 4.2 we establish an upper bound on the number of elements per line
in a large cluster; this result is a crucial ingredient in the proof of Proposition 2.4. Section 4.3
contains a proof of the lower bound part of Proposition 2.1 as well as a proof of Proposition 2.4.

4.1. Component exploration and strategy of proof. We take an initial vertex $v_0 = (x_0, y_0)$
and explore its cluster, $C(v_0)$, by exploring the vertices in that cluster successively one at a time,
in a breadth-first order. Exploring a vertex $(x, y)$ means that we consider all the edges from $(x, y)$ to $(x, j)$ for
$j \neq y$ in the order of increasing $j$, and decide for each one in turn if it is open with probability $p$
or closed with probability $1 - p$; then we do the same for the edges from $(x, y)$ to $(i, y)$ for $i \neq x$ in the order of
increasing $i$. Note that, until the moment all available vertices in the cluster have been explored,
the number of explored vertices at time $t$ is equal to $t$.

Let us introduce colours as follows. At time $t$, all vertices that have not yet been explored
and are not yet contained in $C(v_0)$ are white. All unexplored vertices connected to $v_0$ (that is,
included in $C(v_0)$) at time $t$ are green. All explored vertices are red. (Thus, in particular, at time 0
all vertices are white except for $v_0$, which is green.) In fact, we need to modify this exploration
process slightly as follows: when exploring a green vertex we only consider those of its edges where
the other endpoint of the edge is white. If such an edge is found to be open, then we colour its
other endpoint green.
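The exploration with colours described above can be sketched in code. This is a minimal illustration, not the authors' implementation: vertices are indexed from 0 rather than 1, and the function name, seed handling and return values are assumptions made for the example. Each edge from the explored vertex to a white endpoint is opened independently with probability $p$, and the number of exploration steps at termination equals the cluster size.

```python
import random
from collections import deque

WHITE, GREEN, RED = 0, 1, 2

def explore_cluster(n: int, p: float, v0=(0, 0), seed: int = 0):
    """Breadth-first exploration of the percolation cluster of v0 in H(2, n).

    Returns (cluster_size, steps), where steps is the number of explored
    (red) vertices when no green vertices remain.
    """
    rng = random.Random(seed)
    colour = {(x, y): WHITE for x in range(n) for y in range(n)}
    colour[v0] = GREEN
    queue = deque([v0])          # green vertices, in breadth-first order
    steps = 0
    while queue:
        x, y = queue.popleft()
        # First the neighbours (x, j), j != y, in increasing j;
        # then the neighbours (i, y), i != x, in increasing i.
        neighbours = [(x, j) for j in range(n) if j != y]
        neighbours += [(i, y) for i in range(n) if i != x]
        for w in neighbours:
            # Only edges whose other endpoint is white are examined.
            if colour[w] == WHITE and rng.random() < p:
                colour[w] = GREEN
                queue.append(w)
        colour[(x, y)] = RED     # the vertex is now explored
        steps += 1
    cluster_size = sum(1 for c in colour.values() if c != WHITE)
    return cluster_size, steps
```

For example, `explore_cluster(5, 1.0)` opens every examined edge and so reaches all 25 vertices, while `explore_cluster(5, 0.0)` stops after exploring $v_0$ alone.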
Let $C_t(v_0)$ be the set of vertices included in the cluster of $v_0$ by the time $t$. Let also $G_t(v_0)$ be
the set of green vertices in the cluster at time $t$. Thus $C_t(v_0)$ consists of all green and red vertices
at time $t$, and $R_t(v_0) = C_t(v_0) \setminus G_t(v_0)$ is the set of red vertices at time $t$. All the remaining
vertices in the graph are white.

Let $T_{v_0}$ denote the smallest time $t$ when there are no green vertices remaining, that is $T_{v_0} =
\min\{t : |G_t(v_0)| = 0\}$. Note that $T_{v_0} = |C(v_0)|$, the size of the cluster of vertex $v_0$, and $|R_t(v_0)| = t$
for all $t \le T_{v_0}$. Choose a parameter $\eta = \eta(\varepsilon, V)$ such that $0 < \eta \ll \varepsilon$ and let

$$T = T_{v_0} \wedge \lceil \eta V \rceil \tag{4.1}$$

be the minimum of $T_{v_0}$ and $\lceil \eta V \rceil$.

Given an integer $i \in \{1, \dots, n\}$, let $C_t(v_0, i)$ be the set of vertices $(i, y)$ included in the cluster
at time $t$. (This is the collection of all the elements of the $i$-th horizontal line added by time
$t$ during the exploration procedure.) Let also $C(v_0, i)$ be the set of all vertices $(i, y)$ in $C(v_0)$,
that is, the collection of all the elements in the $i$-th horizontal line contained in $C(v_0)$. We
further denote the number of elements of the $i$-th horizontal line included in $C(v_0)$ until time $T$
by $N(v_0, i) = |C_T(v_0, i)|$.

Similarly, let $\bar C_t(v_0, i)$ be the set of vertices $(x, i)$ included in the cluster at time $t$ (that is, all the
$i$-th vertical line elements added by time $t$ during the exploration procedure). Let $\bar C(v_0, i)$ be the
set of all vertices $(x, i)$ in $C(v_0)$; and, finally, denote the number of elements of the $i$-th vertical
line included in $C(v_0)$ until time $T$ by $\bar N(v_0, i) = |\bar C_T(v_0, i)|$.

We write $(x_t, y_t)$ for the vertex that is explored at time $t$ if such a vertex exists, that is, if $t \le T$.
We may identify the set of colours with the set $\{0, 1, 2\}$. The state of the exploration process
at time $t$ is the list giving the colour of each vertex, in other words, a vector of length $V = n^2$ with
entries in $\{0, 1, 2\}$. This process defines a natural filtration $\varphi_0 \subseteq \varphi_1 \subseteq \cdots$, where $\varphi_t$ is the smallest
$\sigma$-field with respect to which the state at time $t$ is measurable. (Informally, $\varphi_t$ corresponds to
"everything that has occurred until time $t$".) We note that $T$ is a stopping time with respect to
this filtration. We note also that, even on the event $\{T = \lceil \eta V \rceil\}$, it is not necessarily the case
that $|C_T(v_0)| = \lceil \eta V \rceil$, since the number of new vertices added at each exploration step is a random
variable, which can be smaller or greater than 1. We stop our process at time $T$, and we make
the convention that $C_t(v_0) = C_T(v_0)$ for all $t \ge T$ (and similarly for all other relevant random
variables). This is important when $T = T_{v_0} < \eta V$, that is, when the process dies out before time
$\eta V$.

Following the notation of Section 3, we let $F$ denote the total population size of a Galton-Watson
process starting with one individual, where the offspring distribution is $\mathrm{Bi}(\Omega, p)$; and further $\mathbb{P}_{\Omega,p}$
denotes the probability measure corresponding to this branching process. Proposition 2.1 involves
upper and lower bounds on $\mathbb{P}_p(|C(v_0)| \ge \ell)$ for appropriate choices of $\ell$. These bounds are
formulated in Lemmas 4.1-4.3 below.

Lemma 4.1 (Stochastic domination of cluster size by branching process progeny size). For every
$\ell \in \mathbb{N}$,

$$\mathbb{P}_p(|C(v_0)| \ge \ell) \le \mathbb{P}_{\Omega,p}(F \ge \ell). \tag{4.2}$$

The result in Lemma 4.1 is standard, and we will omit its proof. In essence, it follows since
in the cluster exploration, from each vertex being explored, the number of new vertices
added to the cluster is stochastically dominated by a $\mathrm{Bi}(\Omega, p)$ random variable, independently of
what has already been added. Thus the total cluster size is stochastically at most the total
population size of the binomial Galton-Watson process, as claimed.

Lemma 4.2 below follows directly from Lemma 4.1 and Proposition 3.3 and establishes the
upper bound part of Proposition 2.1. It is also used in the proof of Lemma 5.2 in Section 5.
Lemma 4.2. For every $\ell \in \mathbb{N}$, and for $\varepsilon \ge V^{-1/3}$,

$$\mathbb{P}_p(|C(v_0)| \ge \ell) \le 2\varepsilon + O(\varepsilon^2) + O(\ell^{-1/2}).$$

In particular, if $\varepsilon^{-2} \ll \eta V \ll \varepsilon V$, then

$$\mathbb{P}_p(|C(v_0)| \ge \eta V) \le 2\varepsilon(1 + o(1)). \tag{4.3}$$

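Proof. By Lemma 4.1, for every $\ell \in \mathbb{N}$,

$$\mathbb{P}_p(|C(v_0)| \ge \ell) \le \mathbb{P}_{\Omega,p}(F \ge \ell).$$

Our choice of $p = p(n)$ implies that $\Omega p = 1 + \varepsilon > 1$, that is, the $\mathrm{Bi}(\Omega, p)$ Galton-Watson process
is supercritical. By Proposition 3.3,

$$\mathbb{P}_{\Omega,p}(F \ge \ell) = 2\varepsilon + O(\varepsilon^2) + O(\ell^{-1/2}).$$

For $\ell = \eta V$, we have that $\eta V \gg \varepsilon^{-2}$, and so $\ell^{-1/2} = o(\varepsilon)$, which completes the proof. □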

Our next lemma establishes a lower bound on $\mathbb{P}_p(|C(v_0)| \ge \ell)$, that is, the lower bound part of
Proposition 2.1.

Lemma 4.3 (Stochastic domination of cluster size over branching process progeny size). For every 

Pp(|C(vo)| >l)> P^,,(F > l)+0{y-'), (4.4) 

where = — | max{£n^^, C logn}. 

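Consequently, if $\varepsilon \gg V^{-1/3}$, $\eta \ll \varepsilon$ and $\varepsilon^{-2} \ll \eta V$, then

$$\mathbb{P}_p(|C(v_0)| \ge \eta V) \ge 2\varepsilon(1 + o(1)). \tag{4.5}$$

Lemma 4.3 is proved in Section 4.3, where we show that the cluster size stochastically dominates
the total progeny of the $\mathrm{Bi}(\tilde\Omega, p)$ Galton-Watson process.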

4.2. Upper bounds on the cluster size and structure. In this section we give an upper
bound on the number of elements of a large cluster that belong to a particular horizontal line. The
following proposition is crucial in the proofs of Proposition 2.4 and Lemma 4.3.

Proposition 4.4 (Upper bound on the number of elements per line in a large cluster). Let 

e = e{n) > be such that e = e{n) < 1/20 and choose t] <^ e. Further, let A^(vo, i) be the number 
of elements of C-riyo) = Cp^yi(vo) that belong to the horizontal line i. There exists a positive 
constant c\ such that for every > 

Ppf max : A^(vo,i) > (1 + z/)— r/n) < ne"^!^^". (4.6) 

Vi=l,...,n 9 / 

Furthermore, there exist constants C2, C3, C4 > such that the following holds: 

(1) Let n G N and rj = rj^n) be such that rjn > 02 logn. If n is sufficiently large, then 

Ppl max 

: N{^ro,i) > -vn) < c^V'^ (4.7) 

V i=l,...,n 4 / 

(2) Let n G N and rj = ri{n) be such that rjn/ logn < €2- If n is sufficiently large, then 



Pp 



( max : iV(vo, i) > C3 logn ) < c^V ^. {i.t 

\i=l,...,n / 
Here is an informal outline of the proof. Whenever we explore a vertex not on the line $i$, we add
an element of line $i$ with probability $p$. On the other hand, each vertex belonging to the line $i$ has
$n - 1$ neighbours on that line. Whenever such a vertex is explored, each one of its neighbours on the
line $i$ is included with probability $p$ (unless it is already in the cluster). It follows that the number
of new elements on line $i$ resulting from exploring a vertex belonging to that line is stochastically
dominated by a $\mathrm{Bi}(n - 1, p)$ Galton-Watson process. Since $p = \frac{1+\varepsilon}{2(n-1)}$ and $\varepsilon \le 1/2 < 1$ for $n$ large
enough, we have that $(n - 1)p < 1$, so that the Galton-Watson process is subcritical. Hence, using
standard concentration of measure techniques, we are able to upper bound the number of elements
on line $i$ that make it into a large cluster. We now make this argument precise.

Proof of Proposition 4.4. Let $i \in \{1, \dots, n\}$ and, for each $t = 1, \dots, T$, let $S_t(i)$ be the number of
times $s \le t$ such that the edge between $(x_{s-1}, y_{s-1})$ and $(i, y_{s-1})$ is open and $x_{s-1} \neq i$. That is, for each time $t \le T$, $S_t(i)$ is
the number of times we enter the horizontal line $i$ until time $t$. We can write

$$S_t(i) = \sum_{s=1}^{t} Y_s(i),$$

where $Y_t(i)$ is the indicator of the event that the edge between $(x_{t-1}, y_{t-1})$ and $(i, y_{t-1})$ is open, and
$x_{t-1} \neq i$. For each time $t$,

$$\mathbb{P}_p(Y_t(i) = 1 \mid \varphi_{t-1}) \le p,$$

and so known results (see for instance Lemma 2.2 in [20]) imply that $S_t(i)$ is stochastically
dominated by a $\mathrm{Bi}(t, p)$ random variable. Consequently, for every $u \ge 0$, the moment generating
function $M_{S_t(i)}(u)$ is bounded above by $(1 + p(e^u - 1))^t$.

For $r = 1, 2, \dots$, let $Z_r(i)$ be the number of vertices $(i, x)$ added as a result of the $r$-th entry on
to horizontal line $i$. Given that vertex $(i, x_0) \in C(v_0)$, the number of its neighbours $(i, x)$ added to
$C(v_0)$ during its exploration (if it has occurred by the time $\eta V$) is easily seen to be stochastically
dominated by a random variable $\mathrm{Bi}(n - 1, p)$. Hence, for each $r$, $Z_r(i)$ is stochastically dominated
by the total progeny in a branching process with offspring distribution $\mathrm{Bi}(n - 1, p)$ descending from
a single individual. Since $p = (1 + \varepsilon)/(2(n - 1))$ and $\varepsilon \le 1/2$, this branching process is subcritical.
We deduce that, for $u \ge 0$, the moment generating function $M_{Z_r(i)}(u)$ of $Z_r(i)$ is bounded above
by the moment generating function $M_Z(u)$ of an integer-valued, finite random variable $Z$, whose
distribution is given by the Otter-Dwass formula (Lemma 3.4). In other words, for each $N \in \mathbb{N}$,

$$\mathbb{P}(Z = N) = \frac{1}{N}\,\mathbb{P}\Big(\sum_{r=1}^{N} \zeta_r = N - 1\Big),$$

where the $\zeta_r$ are i.i.d. $\mathrm{Bi}(n - 1, p)$. It follows that

$$M_Z(u) = \sum_{N=1}^{\infty} \frac{e^{uN}}{N}\,\mathbb{P}\big(\mathrm{Bi}(N(n-1), p) = N - 1\big) = \sum_{N=1}^{\infty} \frac{e^{uN}}{N} \binom{N(n-1)}{N-1} p^{N-1} (1 - p)^{N(n-1) - (N-1)}.$$



Our aim is to derive an upper bound for the above expression. Unlike the branching processes
considered in Section 3, which were (slightly) supercritical, we are now subcritical. Recall that the
expected total progeny of a $\mathrm{Bi}(m, p)$ Galton-Watson process is $\frac{1}{1 - mp}$; using this fact with $m = n - 1$
and $p = \frac{1+\varepsilon}{2(n-1)}$ we see that $\mathbb{E}[Z] = \frac{2}{1-\varepsilon}$.
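As a hedged worked check of this constant, the value $\mathbb{E}[Z] = 2/(1 - \varepsilon)$ (which is also used later in the proof) follows from the standard first-moment computation for a subcritical Galton-Watson process. The symbol $\mu$ for the mean offspring number is introduced here only for illustration.

```latex
% Generation g of a Galton-Watson process with mean offspring \mu < 1
% has expected size \mu^g, so the expected total progeny is a geometric series.
\[
  \mathbb{E}[Z] \;=\; \sum_{g \ge 0} \mu^{g} \;=\; \frac{1}{1-\mu},
  \qquad
  \mu = (n-1)p = \frac{1+\varepsilon}{2}
  \;\Longrightarrow\;
  \mathbb{E}[Z] \;=\; \frac{1}{1-\frac{1+\varepsilon}{2}} \;=\; \frac{2}{1-\varepsilon}.
\]
```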
As $N! \ge (N/e)^N$, we have

N\ N-1 - Nl - N ' ^ ' 



which in turn imphes that 



1+e nTV-I/ 1+5 sN{n~iy{N^l) 



MM <y—{n- if-'e-( r\) - \) 



N=l 

Since 1 — x < e^^, we see that 



MM 



^—e^Ni^^ ) ' """" • 

Ar=l 



which is finite for < n < — 1 — log and n large. Further, it is easily seen that for such u 
and n the contribution due to terms with N > Clogn is negligible, provided that C is a sufficiently 
large constant. 

Clearly, $|C_t(v_0, i)|$ is always bounded above by $\sum_{r=1}^{S_t(i)} Z_r(i)$. In particular, $\sum_{r=1}^{S_{\lceil \eta V \rceil}(i)} Z_r(i)$ is an upper
bound on $|C_T(v_0, i)|$, which equals the number of vertices $(i, x)$ included in the cluster of $v_0$ from
time 0 to time $\eta V$.

Let Mo = |(^^ — 1 — log^^). Since Z is non-negative, Mz{u) > 1. It follows from the above 
(using also the bound 1 + x < e^) that for < m < uq, 

MNi.,M < Ms^,i^{\ogMM) < {l+p{MM - 1))^"''^ (4.12) 

< exp (^\7]V]p{MM - '^)) 

= (1 + o(l)) exp (^r/n(l + e){MM - 1)) 

< (1 + o(l)) exp rin^^u + ^r/n(l + e)u^M'^{uo) ) , 

where the final inequality comes from a second order Taylor expansion. Also we have used the 
fact that K[Z] = 2/(1 — e), and that the second derivative M'^{u) is increasing in u for n < uq. 
Now we run a standard large deviations argument. For all k, 



e" 

< (1 + o(l))exp [{vn^^ - k^u + ^r]n{l+e)u^M'^{uo)y (4.13) 

The expression in (14.131) can be optimised with respect to u in the usual way, and one finds that 
there exists a constant Ci > such that for all u > 0, 



Fp(^N{^r,i) > {1 + u)r]n^ 

which yields 



Fp( max N{'v,i) > (1 + iy)rin — — — ^ < ne 

\ i=l,...,n 1 — 6 / 
Since $\varepsilon \le 1/20$, we have $(1 + \varepsilon)/(1 - \varepsilon) \le 11/9$; and therefore there exists a constant $c_1$ such that,
for $\nu > 0$,

P„( max iV(v,z) > —(1 + u)r]n) < ne"^''^'^''", (4.14) 

j=l,...,n 9 

which proves the first statement of Proposition I4.4[ 

For the remainder, first suppose that rjn/ log n > 176/ci; we may take = 1/44 in (14.141) to 
deduce that 

Ppf max iV(v,z) > ^r]n) < = V'^. 

\i=l,...,n 4 / 

Now assume that rjn/ logn < 176/ci. Note that A^(v, i) is stochastically dominated by J2r=i Zr{i), 
where S = Bi([?7l^],p). Since rjV < 176 lognn/ci, N{\,i) is stochastically dominated by a 
Bi( [176cj'^n logn] ,p) random variable. We can perform the moment generating function and 
large deviations calculations as in (14.111) - (I4.13P above, to find that 

/ 220 \ 

P„( max A^(v,z) > — logn) < rie~^^°s"(^+°(^)) = o{V~^). 

\i=l,...,n Ci / 

Taking $c_2 = 176/c_1$, $c_3 = 220/c_1$ and $c_4 = 1$ completes the proof of Proposition 4.4. □

4.3. Lower bounds on the cluster size and structure. In this section we establish corresponding
lower bounds on the cluster size and structure. We first give a proof of Lemma 4.3, which will in
particular establish a lower bound on $\mathbb{P}_p(T = \lceil \eta V \rceil)$, that is, a lower bound on $\mathbb{P}_p(|C(v_0)| \ge \eta V)$.
Our argument will rely on a coupling with a suitable lower bounding Galton-Watson process and
the estimates of Proposition 4.4.

Proof of Lemma 4.3. As in Sections 4.1 and 4.2, $C_t(v_0, i)$ denotes the set of vertices $(i, y)$ (with
$i \in \{1, \dots, n\}$ fixed and $y \in \{1, \dots, n\}$ varying) included in the exploration of cluster $C(v_0)$
until time $t$; and $\bar C_t(v_0, i)$ denotes the set of vertices $(x, i)$ included until time $t$.

Let Ci be as in Proposition 14. 4[ Let m = |max{?7n, ^^logn}. For each time t, let £t be the 

event that |C((vo, i)\ < m and |C*((vo, i)\ < m for alH = 1, . . . , n. 

Define $\tilde\Omega = \Omega - 2m = 2(n - 1 - m)$. Then, provided that the event $\mathcal{E}_t$ occurs, conditionally on $\varphi_t$
(that is, given everything else that may have happened until time $t$), the number of vertices added
to $C_t(v_0)$ as a result of exploring $(x_t, y_t)$ stochastically dominates a $\mathrm{Bi}(\tilde\Omega, p)$ random variable. We
note that $\Omega - \tilde\Omega = O(\eta n + \log n) = o(n)$, since $\eta \to 0$ as $n \to \infty$.

We shall couple our exploration process with a Galton-Watson process starting with a single
individual, where the offspring distribution is $\mathrm{Bi}(\tilde\Omega, p)$. The mean offspring size for this
Galton-Watson process is

$$\tilde\Omega p = \Big(1 - \frac{m}{n-1}\Big)(1 + \varepsilon) = 1 + \varepsilon - O(\eta + n^{-1}\log n) = 1 + \varepsilon(1 + o(1)),$$

where we have used the fact that $\eta \ll \varepsilon$ and $n^{-2/3} \ll \varepsilon$. By Proposition 2.3, its survival probability
is $2\varepsilon + O(\eta + n^{-1}\log n + \varepsilon^2) = 2\varepsilon(1 + o(1))$.

Recall the exploration process and its corresponding colours as described in Section 4.1. Let
$F_t$ be the population size of the $\mathrm{Bi}(\tilde\Omega, p)$ Galton-Watson process at time $t$ and let $F_t'$ be the number of green
or active individuals in the Galton-Watson process at time $t$. Also, let $F = \sup_t F_t$ be the total
population size of the Galton-Watson process. Finally, recall that $C_t(v_0)$ is the set of red and
green vertices in the exploration of the cluster of $v_0$, and $G_t(v_0)$ the set of green or active vertices
in the cluster exploration. By construction, $C_t(v_0) \subseteq C(v_0)$ for every $t \ge 0$.




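By the above, on the event $\mathcal{E}_t$ intersected with the event that $|C_t(v_0)| \ge F_t$ and $|G_t(v_0)| \ge F_t'$,
given $\varphi_t$, we can couple the Galton-Watson process with the cluster exploration process for
another step so that $|C_{t+1}(v_0)| \ge F_{t+1}$ and $|G_{t+1}(v_0)| \ge F_{t+1}'$.

It follows by induction that for each $t$ the random variable $|C_t(v_0)|\,\mathbf{1}[\mathcal{E}_t]$ is stochastically at least
$F_t\,\mathbf{1}[\mathcal{E}_t]$. Hence, for each $k$,

$$\mathbb{P}_p(|C_t(v_0)| \ge k) \ge \mathbb{P}_p(\mathcal{E}_t \cap \{|C_t(v_0)| \ge k\}) \ge \mathbb{P}_{\Omega,\tilde\Omega,p}(\mathcal{E}_t \cap \{F_t \ge k\}) \ge \mathbb{P}_{\tilde\Omega,p}(F_t \ge k) - \mathbb{P}_p(\mathcal{E}_t^c),$$

where $\mathbb{P}_{\Omega,\tilde\Omega,p}$ denotes the coupling measure. In the final inequality, we have used the fact that
for every pair of events $A$, $B$, we have $\mathbb{P}(A \cap B) \ge \mathbb{P}(A) - \mathbb{P}(B^c)$.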

By Proposition 14. 4[ Fp{£t) = 0{V~^), and so, for each t < r]V, we obtain 

Pp(|a(vo)| >k)> P^_^(F, >k) + OiV^'), 

which establishes f l4.4p . Similarly, for each time t and non-negative integer k, 

Pp(|G*(vo)| > k) > P^_p(F; >k)- ¥pi£^), 

and, in particular, for each t < riV, 

Pp(T >t)= Pp(|G,(vo)| > 1) > P^,p(F; > 1) + 0{V~'). 

Notice that, for t < rjV, 

P^,p(F; = 0) < Pn,p(i^rVl = 0) < ^nA^ < 

In this way we arrive at 

Pp(|a(vo)| >k)> F^^^iFt >k) + OiV-') 

>F^AFt>k,Fl>0) + OiV^') 

= p^,p(f; > 0) - p^,p(f; > o, < a:) + o{v-') 

> p^_^(f; > 0) - Fp{Bi{m,p) <k) + o{v-') 

> F^/F = 00) - Fpmm,p) <k) + 0{v-'), 

since on the event that the process is alive at time t and the event £t occurs, we can couple the 
number of vertices added at all steps until t so that it is at least as large as a sum of t independent 
binomials Bi{Q,p). 

Hence, also using the facts that p > 1/Q and 1^2 — r2| = o{Q), for every constant 6 G (0, 1) there 
is a constant a > such that 

Pp(|C(vo)| > (1 - 6)r]V^ > Pp(|C,y(vo)| > (1 - 6)vV) (4.15) 

> P^,p(F = 00) - Fp{BiirjVn,p) < (1 - 5)rjV) + OiV') 
= 2e + 0{r] + n-^ \ogn + e^) + e""'?^ + 0{V-^). 

But equally, we could run the exploration process until time (1 + 6)'r]V to obtain a cluster of size 
T]V whp, that is, we could use the above with 77 replaced by 77/(1 — 5) to obtain that 

Pp(|C(vo)| >r]V) > 2e + 0{r] + n~Hogn + e^) + e-'''^^ + 0{V~^). (4.16) 

This establishes (4.5), thus completing the proof of Lemma 4.3, and hence also the proof of
Proposition 2.1. □
Let us call a horizontal line good if it contains at least $\eta V/(4n) = \eta n/4$ elements in $C(v_0)$ along
that line, and bad otherwise. We shall now prove Proposition 2.4, thus establishing a lower bound
on the number of good lines.



Pp^max max : A^(vo,z) < -rjn) = 0{V 

\ V() 1=1... .,n 4 / 



Proof of Proposition 2.4. As earlier, for a vertex $v_0$ and $i \in \{1, \dots, n\}$, $|C_t(v_0, i)|$
denotes the number of elements of the $i$-th horizontal line contained in $C_t(v_0)$, the part
of $C(v_0)$ obtained by running the exploration process until time $t$. Also, $|C(v_0, i)|$ is the number of
elements of the $i$-th horizontal line in $C(v_0)$ and $N(v_0, i)$ is the number of such elements included
in $C(v_0)$ until time $\lceil \eta V \rceil$.

Let $c_2$ be as in Proposition 4.4, statement (1), and choose $C = c_2$. By Proposition 4.4,
statement (1), we have

5 

. _ , ^ . ^, „^ _ -5 

V() i=l,...,n 4 

We select a vertex $v_0$. Let $A$ be the event that $\{\max_{i=1,\dots,n} N(v_0, i) \le \frac{5}{4}\eta n\}$. Let also $B$ ("B"
for "bad") be the event that fewer than $3n/4$ lines are good for the cluster $C(v_0)$. On the event
that $|C(v_0)| \ge \eta V$, we have $|C_{\lceil \eta V \rceil}(v_0)| \ge |R_{\lceil \eta V \rceil}(v_0)| = \lceil \eta V \rceil$. It follows that we only need to show
that

$$\mathbb{P}_p(B \cap \{|C(v_0)| \ge \eta V\}) = \mathbb{P}_p(B \cap \{|C_{\lceil \eta V \rceil}(v_0)| \ge \eta V\}) = o(V^{-1}). \tag{4.17}$$

Indeed, summing over all vertices $v_0$ we may deduce from (4.17) that whp there is no $v_0$ such
that $|C(v_0)| \ge \eta V$ and fewer than $3n/4$ lines are good for $C(v_0)$. In order to establish (4.17), we
shall show that

$$\mathbb{P}_p(B \cap \{|C_{\lceil \eta V \rceil}(v_0)| \ge \eta V\}) \le \mathbb{P}_p(A^c). \tag{4.18}$$

Clearly, $|C(v_0, i)| \ge N(v_0, i)$ for every $i$. Let us write $g_{v_0}$ and $b_{v_0}$ respectively for the number of
good and bad lines in $C_{\lceil \eta V \rceil}(v_0)$.

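On the event $A$, the explored cluster $C_{\lceil \eta V \rceil}(v_0)$ at time $\lceil \eta V \rceil$ contains at most $5\eta n/4$ elements of
every good line and at the same time has size at least $\eta V$. Hence, using also that $g_{v_0} = n - b_{v_0}$, on
$A \cap \{|C_{\lceil \eta V \rceil}(v_0)| \ge \eta V\}$,

$$\eta V \le |C_{\lceil \eta V \rceil}(v_0)| \le \frac{5}{4}\eta n\, g_{v_0} + \frac{1}{4}\eta n\, b_{v_0} = \eta n\, g_{v_0} + \frac{1}{4}\eta V,$$

which gives

$$\frac{3}{4}\eta V \le \eta n\, g_{v_0},$$

and hence

$$g_{v_0} \ge \frac{3}{4} n.$$

In other words, on $A \cap \{|C_{\lceil \eta V \rceil}(v_0)| \ge \eta V\}$, the number of good lines is at least $3n/4$, which means
that

$$\mathbb{P}_p(B \cap A \cap \{|C_{\lceil \eta V \rceil}(v_0)| \ge \eta V\}) = 0, \tag{4.19}$$

and so establishes claim (4.18). Then, from Proposition 4.4, we see that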

Fp{Bn{\C,,,,i^ro)\>r|V})<Fp{A') = 0{V-'), (4.20) 

as required. This completes the proof of Proposition 2.4. □
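The counting step behind (4.19) is purely deterministic and can be illustrated in a few lines. The sketch below is an assumption-laden illustration, not part of the original proof: it works in integer units $u = \eta n/4$, so that a line is good when it holds at least $u$ cluster elements, the event $A$ caps every line at $5u$, and the cluster-size hypothesis reads "total at least $4un$". The function names are hypothetical.

```python
import random

def count_good_lines(line_counts, u):
    """Number of lines holding at least u = eta * n / 4 cluster elements."""
    return sum(1 for c in line_counts if c >= u)

def good_line_bound_holds(line_counts, u):
    """Check the counting argument on one configuration.

    Hypotheses: every line holds at most 5u elements (event A) and the
    total is at least 4 * u * n (cluster size at least eta * V).
    Returns None if the hypotheses fail, otherwise whether at least
    3n/4 lines are good (the argument says this is always True).
    """
    n = len(line_counts)
    if max(line_counts) > 5 * u or sum(line_counts) < 4 * u * n:
        return None
    return 4 * count_good_lines(line_counts, u) >= 3 * n

# Random configurations (hypothetical sizes n = 8, u = 4) never violate the bound.
rng = random.Random(1)
outcomes = [
    good_line_bound_holds([rng.choice([0, 3, 20, 20, 20, 20]) for _ in range(8)], 4)
    for _ in range(2000)
]
```

The configuration with six lines at the cap $5u$ and two lines just below $u$ has total strictly less than $4un$, which shows why fewer than $3n/4$ good lines is incompatible with the size hypothesis.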
5. Concentration of measure for the number of vertices in large clusters

This section contains our proof of Proposition 12.21 In outline, the goal is to establish concentra- 
tion of measure for 2'>^-2, for an appropriate choice of a -C e to be determined below. This will 
be carried out by second moment methods, in a slightly unusual way, as we explain now. 

For every define the centered versions of the random variables by 

= Z>,-Ep[Z>,]. (5.1) 

The entire proof revolves around two scales of magnitude, denoted by N_ and A^. The value is 
the large scale, and corresponds to Sq'^ in Proposition 12.21 The value N_ is the smaller scale, and 
will be determined below. The scales N_ and N are related through a positive integer I defined by 

N = m^. (5.2) 

With this notation, proving Proposition 12.21 amounts to establishing that 

Fp(|4^| >fev) =o(l). (5.3) 

We first observe that 

/-I 

|^>]v| ^ |^>iv| + ^ ^ |^>2'+ljV " ^>2'jv|- 

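The goal is now to establish sufficient bounds on the variances of the above random variables, so that we can prove that $Z_{\ge \overline{N}}$ is concentrated. For this, we choose a sequence $\{\delta_i\}_{i=0}^{I-1}$ such that each $\delta_i > 0$ and $\sum_{i=0}^{I-1} \delta_i \le \delta/2$. If $|\bar{Z}_{\ge \overline{N}}| \ge \delta\varepsilon V$, then either $|\bar{Z}_{\ge \underline{N}}| \ge \delta\varepsilon V/2$, or $|\bar{Z}_{\ge 2^{i+1}\underline{N}} - \bar{Z}_{\ge 2^{i}\underline{N}}| \ge \delta_i\varepsilon V$ for some $0 \le i \le I-1$. Consequently,

$$\mathbb{P}_p\big(|\bar{Z}_{\ge \overline{N}}| \ge \delta\varepsilon V\big) \le \mathbb{P}_p\big(|\bar{Z}_{\ge \underline{N}}| \ge \delta\varepsilon V/2\big) + \sum_{i=0}^{I-1} \mathbb{P}_p\big(|\bar{Z}_{\ge 2^{i+1}\underline{N}} - \bar{Z}_{\ge 2^{i}\underline{N}}| \ge \delta_i\varepsilon V\big), \tag{5.4}$$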

and we are going to upper bound each term on the right hand side separately. Our argument relies on estimating the variance of $Z_{\ge \underline{N}}$ and those of the differences $Z_{\ge 2^{i+1}\underline{N}} - Z_{\ge 2^{i}\underline{N}}$. This is accomplished in Section 5.2; see Lemmas 5.2 and 5.3. The variance estimates impose various restrictions on $\underline{N}$ and $\overline{N}$; in Section 5.3 we show that these can be satisfied as long as $\varepsilon^3 V \gg \log n$, which establishes Proposition 2.2. The key to the proof is to choose $\underline{N}$, $\overline{N}$ and $\{\delta_i\}_{i=0}^{I-1}$ so as to ensure adequate concentration of measure.
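The dyadic decomposition above can be checked numerically on a small instance. The following is a minimal sketch, not part of the proof: it samples a random subgraph of H(2, n) with a union-find structure, counts the vertices in clusters of size at least a given threshold, and exposes the dyadic differences between consecutive scales. The function names (`cluster_sizes`, `z_at_least`, `telescoping_terms`) and the parameter values in the usage note are illustrative assumptions.

```python
import random
from collections import Counter


def cluster_sizes(n, p, seed=0):
    """Bond percolation on H(2, n): return the cluster size of every vertex.

    Vertices are pairs (row, column) encoded as row * n + column. Two vertices
    are adjacent when they differ in exactly one coordinate, and each edge is
    kept independently with probability p.
    """
    rng = random.Random(seed)
    parent = list(range(n * n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb

    for r in range(n):
        for c in range(n):
            v = r * n + c
            for c2 in range(c + 1, n):  # edges inside row r, each visited once
                if rng.random() < p:
                    union(v, r * n + c2)
            for r2 in range(r + 1, n):  # edges inside column c, each visited once
                if rng.random() < p:
                    union(v, r2 * n + c)

    roots = [find(v) for v in range(n * n)]
    size_of_root = Counter(roots)
    return [size_of_root[root] for root in roots]


def z_at_least(sizes, threshold):
    """Number of vertices lying in clusters of size at least `threshold`."""
    return sum(1 for s in sizes if s >= threshold)


def telescoping_terms(sizes, n_low, levels):
    """Differences Z_{>= 2^i n_low} - Z_{>= 2^(i+1) n_low} for i = 0..levels-1."""
    return [
        z_at_least(sizes, 2 ** i * n_low) - z_at_least(sizes, 2 ** (i + 1) * n_low)
        for i in range(levels)
    ]
```

For example, with `sizes = cluster_sizes(12, 1.5 / 22)`, the count at the large scale `z_at_least(sizes, 16)` equals `z_at_least(sizes, 2) - sum(telescoping_terms(sizes, 2, 3))`, which is the identity behind the triangle-inequality bound.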

The remainder of this section is organised as follows. In Section 5.1 we prove cluster tail bounds of the form $\mathbb{P}_p(|C(v_0)| \in [\ell, 2\ell])$; these are needed to estimate the distribution of the random variables $Z_{\ge 2^{i+1}\underline{N}} - Z_{\ge 2^{i}\underline{N}}$. Here we shall make use of the Galton-Watson process estimates and comparisons established in Sections 3 and 4. Then, in Section 5.2 we upper bound the variances of $Z_{\ge \underline{N}}$ and $Z_{\ge 2^{i+1}\underline{N}} - Z_{\ge 2^{i}\underline{N}}$. Finally, in Section 5.3 we complete our proof of Proposition 2.2.

5.1. Key ingredients. As before, for a positive integer $N$ and an edge probability $p$, $\mathbb{P}_{N,p}$ denotes the probability measure corresponding to the Galton-Watson process where the family size is a $\mathrm{Bi}(N, p)$ random variable; also, $F$ is the total progeny.
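To make the comparison object concrete, here is a minimal Monte Carlo sketch of the total progeny of a Galton-Watson process with binomial family sizes. It is only an illustration under stated assumptions: the function names (`binomial`, `total_progeny`, `tail_estimate`) are hypothetical, the binomial sampler is deliberately naive, and the exploration is truncated at a cap so that supercritical runs terminate.

```python
import random


def binomial(n, p, rng):
    """Sample Bi(n, p) by summing n Bernoulli(p) trials (simple, not fast)."""
    return sum(1 for _ in range(n) if rng.random() < p)


def total_progeny(n, p, cap, rng):
    """Total progeny of a Galton-Watson process with Bi(n, p) family sizes.

    Exploration stops once `cap` individuals have been counted, so the
    returned value is min(total progeny, cap).
    """
    alive = 1  # individuals whose children have not been sampled yet
    total = 1  # individuals counted so far, the root included
    while alive > 0 and total < cap:
        children = binomial(n, p, rng)
        alive += children - 1
        total += children
    return min(total, cap)


def tail_estimate(n, p, ell, trials, seed=0):
    """Monte Carlo estimate of the probability that the total progeny is >= ell."""
    rng = random.Random(seed)
    hits = sum(total_progeny(n, p, ell, rng) >= ell for _ in range(trials))
    return hits / trials
```

Calling `tail_estimate(n, p, ell, trials)` with `n` equal to the vertex degree 2(n - 1) of the Hamming graph gives an empirical version of the tail probabilities that the cluster sizes are compared with in this section.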

The remainder of this section is devoted to establishing a bound on the cluster tail crucial to the arguments in Sections 5.2 and 5.3. Recall that $\Omega = 2(n-1)$, choose a positive integer $\ell = o(\varepsilon V)$, and suppose that $\bar\Omega = \bar\Omega(n)$ satisfies $\Omega - \bar\Omega = O(\log n + \ell/n)$. Suppose further that $\varepsilon = \varepsilon(n) = \Omega p - 1 \to 0$ such that $V^{-1/3} \ll \varepsilon \ll 1$ for $V = n^2$ sufficiently large. Then, writing $\bar\varepsilon = \bar\Omega p - 1$, we have $|\varepsilon - \bar\varepsilon| = p|\Omega - \bar\Omega| = O(\log n/n + \ell/n^2) = o(\varepsilon)$, since $\ell = o(\varepsilon V)$, and so we may use the results of Proposition 3.1. Hence, as long as $\ell = o(\varepsilon V)$, we have, uniformly in $n$,

l^nAF >e)- >i)\<c{p\n-n\ + -L^ + ^y (5.5) 

We shall use inequality (5.5) in the following lemma to identify the cluster tail distribution more precisely.

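Lemma 5.1 (Bound on the cluster tail). Set $p = p_c + \varepsilon/\Omega$. Let $V^{-1/3} \ll \varepsilon \ll 1$, and let $\ell \in \mathbb{N}$ satisfy $\ell \le V^{2/3}$ and $\ell \ll \varepsilon V$. Then there exists a constant $C$ such that, for $n$ sufficiently large,

$$\mathbb{P}_p\big(|C(v_0)| \in [\ell, 2\ell]\big) \le \frac{C}{\sqrt{\ell}}. \tag{5.6}$$

Proof. By Lemma 4.1,

$$\mathbb{P}_p(|C(v_0)| \ge \ell) \le \mathbb{P}_{\Omega,p}(F \ge \ell).$$

Further, by Lemma 4.3, $C$ can be chosen large enough that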

P,(|C(vo)| > 21) > F^JF > 2£) + OiV~'), 

where Q = Q — | max{£n^^, Clogn}. Let e = Clp — 1 and note that |e — e| < ^{i + nlogn) 
(after a suitable adjustment of $C$). Since $\varepsilon \gg V^{-1/3} = n^{-2/3}$, our assumptions on $\ell$ imply that $|\varepsilon - \bar\varepsilon| = o(\varepsilon)$. By Propositions 3.1 and 3.2,

F^JF > 2i) > FnAF > ^ + 0{\e - e\ + + ^) 
> ^nAF >i) + Oi\e- e\) + 0{r'/^). 

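It follows that

$$\mathbb{P}_p(|C(v_0)| \ge \ell) - \mathbb{P}_p(|C(v_0)| \ge 2\ell) \le O(|\varepsilon - \bar\varepsilon|) + O(\ell^{-1/2}),$$

and we only need to show that $|\varepsilon - \bar\varepsilon|$ is $O(\ell^{-1/2})$. This is equivalent to showing that both $\ell/V$ and $\log n/n$ are $O(\ell^{-1/2})$. The condition $\ell \le V^{2/3}$ is equivalent to $\ell/V \le \ell^{-1/2}$. The bound $\log n/n \le \ell^{-1/2}$ holds when $\ell \le n^2/(\log n)^2$; as $\ell \le V^{2/3} = n^{4/3}$, this is also true for $n$ sufficiently large. □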
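The phrase "for n sufficiently large" in the last step of the proof can be made tangible with a small numerical check. The sketch below, whose helper names (`max_ell`, `second_bound_holds`) are illustrative assumptions, takes the largest integer ell allowed by the condition ell <= V^{2/3} and tests whether log n / n <= ell^{-1/2}, written in the equivalent squared form ell (log n)^2 <= n^2.

```python
import math


def max_ell(n):
    """Largest integer ell with ell**3 <= V**2, i.e. ell <= V^(2/3), V = n*n."""
    v = n * n
    ell = int(round(v ** (2.0 / 3.0)))
    while ell ** 3 > v ** 2:  # correct a floating-point overshoot
        ell -= 1
    while (ell + 1) ** 3 <= v ** 2:  # correct a floating-point undershoot
        ell += 1
    return ell


def second_bound_holds(n):
    """Check log(n)/n <= ell**(-1/2) at ell = max_ell(n).

    Both sides are positive, so the inequality is tested in the squared
    form ell * log(n)**2 <= n**2.
    """
    ell = max_ell(n)
    return ell * math.log(n) ** 2 <= n * n
```

For instance, the check fails at n = 20 and succeeds at n = 1000, which is consistent with the bound holding only once n is large enough.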

5.2. Variance estimates. Lemmas 5.2 and 5.3 below contain variance estimates essential to our proof of Proposition 2.2.

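Lemma 5.2 (Variance of the number of vertices in moderate clusters). Set $p = p_c + \varepsilon/\Omega$. Suppose that $V^{-1/3} \ll \varepsilon \ll 1$. Choose $N$ such that $N = o(\sqrt{V})$ and $N = o(\varepsilon^2 V)$. Then

$$\mathrm{Var}_p(Z_{\ge N}) = o\big((\varepsilon V)^2\big). \tag{5.7}$$

Proof. First note that $\mathrm{Var}_p(Z_{\ge N}) = \mathrm{Var}_p(Z_{<N})$, where

$$Z_{<N} = V - Z_{\ge N} = \sum_{v} I[|C(v)| < N].$$

We expand $\mathrm{Var}_p(Z_{<N})$ as

$$\mathrm{Var}_p(Z_{<N}) = \sum_{v_0, v_1} \Big[\mathbb{P}_p\big(|C(v_0)| < N, |C(v_1)| < N\big) - \mathbb{P}_p\big(|C(v_0)| < N\big)^2\Big]. \tag{5.8}$$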
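We separate each term involving distinct $v_0$ and $v_1$ into two, according to whether or not $v_1 \in C(v_0)$. We can then write

$$\mathrm{Var}_p(Z_{<N}) = S_{v_0 \leftrightarrow v_1} + S_{v_0 \nleftrightarrow v_1}, \tag{5.9}$$

where $S_{v_0 \leftrightarrow v_1} = S_{v_0 \leftrightarrow v_1}(N)$, $S_{v_0 \nleftrightarrow v_1} = S_{v_0 \nleftrightarrow v_1}(N)$ and

$$S_{v_0 \leftrightarrow v_1} = \sum_{v_0, v_1} \mathbb{P}_p\big(|C(v_0)| < N, v_1 \in C(v_0)\big), \tag{5.10}$$

$$S_{v_0 \nleftrightarrow v_1} = \sum_{v_0, v_1} \Big[\mathbb{P}_p\big(|C(v_0)| < N, |C(v_1)| < N, v_1 \notin C(v_0)\big) - \mathbb{P}_p\big(|C(v_0)| < N\big)^2\Big]. \tag{5.11}$$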

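It is easily seen that

$$S_{v_0 \leftrightarrow v_1} = V\,\mathbb{E}_p\big[|C(v_0)|\, I[|C(v_0)| < N]\big], \tag{5.12}$$

and we upper bound

$$\mathbb{E}_p\big[|C(v_0)|\, I[|C(v_0)| < N]\big] = \sum_{l=1}^{N} \mathbb{P}_p\big(l \le |C(v_0)| < N\big) \le \sum_{l=1}^{N} \mathbb{P}_p\big(|C(v_0)| \ge l\big) \le C\big(\varepsilon N + \sqrt{N}\big), \tag{5.13}$$

where the last inequality follows from Lemma 4.2. It follows that

$$S_{v_0 \leftrightarrow v_1} = O\big(VN\varepsilon + V\sqrt{N}\big) = o(\varepsilon^2 V^2), \tag{5.14}$$

provided $N = o(\varepsilon V)$ and $N = o(\varepsilon^4 V^2)$. When $\varepsilon^3 V \gg 1$ then $\varepsilon V \ll \varepsilon^4 V^2$, so only the first constraint on $N$ is binding, i.e. $S_{v_0 \leftrightarrow v_1} = o(\varepsilon^2 V^2)$ as long as $N = o(\varepsilon V)$.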
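To upper bound $S_{v_0 \nleftrightarrow v_1}$, note that, by [H inequality (9.7)],

$$S_{v_0 \nleftrightarrow v_1} \le p \sum_{(u,v)} \mathbb{E}_p\big[|C(u)|\,|C(v)|\, I[|C(u)| < N, |C(v)| < N, v \notin C(u)]\big], \tag{5.15}$$

where the summation is over all edges $\{u, v\}$ of $H(2,n)$. We can estimate this similarly to $S_{v_0 \leftrightarrow v_1}$ above, and find that

$$S_{v_0 \nleftrightarrow v_1} \le p \sum_{(u,v)} \sum_{l_1, l_2 = 1}^{N} \mathbb{P}_p\big(l_1 \le |C(u)| < N, l_2 \le |C(v)| < N, v \notin C(u)\big)
\le p \sum_{(u,v)} \sum_{l_1, l_2 = 1}^{N} \mathbb{P}_p\big(|C(u)| \ge l_1, |C(v)| \ge l_2, v \notin C(u)\big). \tag{5.16}$$

Since $v \notin C(u)$, $|C(u)|$ and $|C(v)|$ are each, independently of one another, stochastically dominated by the total progeny of a $\mathrm{Bi}(\Omega, p)$ Galton-Watson process. (To see this in more detail, think of first constructing the cluster of $u$, and subsequently constructing the cluster of $v$ in the smaller graph with $C(u)$ removed.) Using Lemma 4.2, we then see that, since $\Omega p$ is bounded above as $n \to \infty$,

$$S_{v_0 \nleftrightarrow v_1} \le CV\big(\varepsilon N + \sqrt{N}\big)^2 \le O\big(V\varepsilon^2 N^2 + VN\big). \tag{5.17}$$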
Thus, as long as $N = o(\sqrt{V})$ and $N = o(\varepsilon^2 V)$,

$$S_{v_0 \nleftrightarrow v_1} = o\big((\varepsilon V)^2\big), \tag{5.18}$$

which completes the proof. ■

Lemma 5.3 (Variance of the number of vertices in intermediate clusters). Set $p = p_c + \varepsilon/\Omega$. Assume that $V^{-1/3} \ll \varepsilon \ll 1$, and take $N \in \mathbb{N}$ such that $N \le V^{2/3}$ and $N \ll \varepsilon V$. Then



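Proof. We have

$$\mathrm{Var}_p(Z_{\ge N} - Z_{\ge 2N}) = \sum_{v_0, v_1} \Big[\mathbb{P}_p\big(|C(v_0)| \in (N, 2N], |C(v_1)| \in (N, 2N]\big) - \mathbb{P}_p\big(|C(v_0)| \in (N, 2N]\big)^2\Big].$$

Once again we split the sum according to whether or not $v_1 \in C(v_0)$, obtaining

$$\mathrm{Var}_p(Z_{\ge N} - Z_{\ge 2N}) = S_{v_0 \leftrightarrow v_1} + S_{v_0 \nleftrightarrow v_1}, \tag{5.20}$$

where now

$$S_{v_0 \leftrightarrow v_1} = \sum_{v_0, v_1} \mathbb{P}_p\big(|C(v_0)| \in (N, 2N], v_1 \in C(v_0)\big), \tag{5.21}$$

$$S_{v_0 \nleftrightarrow v_1} = \sum_{v_0, v_1} \Big[\mathbb{P}_p\big(|C(v_0)|, |C(v_1)| \in (N, 2N], v_1 \notin C(v_0)\big) - \mathbb{P}_p\big(|C(v_0)| \in (N, 2N]\big)^2\Big]. \tag{5.22}$$

Just as in the proof of Lemma 5.2,

$$S_{v_0 \leftrightarrow v_1} = V\,\mathbb{E}_p\big[|C(v_0)|\, I[N < |C(v_0)| \le 2N]\big] \le CV\sqrt{N}. \tag{5.23}$$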

But 

flogn V , N 



VVN 

'N 



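and so $S_{v_0 \leftrightarrow v_1}$ is bounded by the right hand side of (5.19).
Dealing with $S_{v_0 \nleftrightarrow v_1}$ requires more effort. Define

$$P_{v_0, v_1} := \mathbb{P}_p\big(|C(v_0)| \in (N, 2N], |C(v_1)| \in (N, 2N], v_1 \notin C(v_0)\big) - \mathbb{P}_p\big(|C(v_0)| \in (N, 2N]\big)^2, \tag{5.25}$$

so that

$$S_{v_0 \nleftrightarrow v_1} = \sum_{v_0, v_1} P_{v_0, v_1}.$$

Now rewrite

$$P_{v_0, v_1} = \mathbb{P}_p\big(|C(v_0)| \in (N, 2N]\big) \Big[\mathbb{P}_p\big(|C(v_1)| \in (N, 2N], v_1 \notin C(v_0) \,\big|\, |C(v_0)| \in (N, 2N]\big) - \mathbb{P}_p\big(|C(v_1)| \in (N, 2N]\big)\Big].$$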

Recall that $N(v, i)$ is the number of elements in the $i$-th horizontal line included in the cluster until time $\eta n^2$. The proof of Proposition 4.4 implies that there is some constant $C > 0$ such that wvhp every $v$ such that $|C(v)| \in (N, 2N]$ satisfies

$$N(v, i) \le C\Big[\log n \vee \frac{N}{n}\Big]. \tag{5.26}$$

(To see this, think of running the exploration process with stopping time $T$ as in (4.1), where $\eta$ is defined by $2N = \lceil \eta V \rceil$ (so in particular $\eta \ll \varepsilon$). Since $|C(v)| \in (N, 2N]$, we have $C(v) = C_T(v)$.) Letting

$$\bar\Omega = \Omega - 2C\Big[\log n \vee \frac{N}{n}\Big], \tag{5.27}$$

we can lower bound 

P,(|C(vo)| G {N, 2N]) > F^^^iF > N) - FnA^ > 2iV) + 0{V). (5.28) 

Further, we can upper bound 

Pp(|C(vi)| e (iV,2iV],vi ^C(vo)||C(vo)| G (iV,2iV]) 

= Pp(|C(vi)| > iV,vi ^C(vo)||C(vo)| G (iV,2iV]) 

-Pp(|C(vi)| > 2iV,vi ^C(vo)||C(vo)| G (iV,2iV]) 

< Pf7,p(F > iV) - P^_p(F > 2iV) + 0(V-=^). (5.29) 

(Once again, to see this, think of first exploring the cluster of $v_0$ and, after that, the cluster of $v_1$ in $H(2, n)$ with the cluster of $v_0$ removed.)

Since $N \le V^{2/3}$ and $N \ll \varepsilon V$, we can use Lemma 5.1 to bound $\mathbb{P}_p(|C(v_0)| \in (N, 2N])$ and obtain

< (Pn,p(F >N)- P^_^(F > AT) + Pn,,(F > 2Ar) - P^^^(F > 2Ar)) + 0{V-'). (5.30) 



By (5.5), with $\varepsilon = p\Omega - 1$ and $\bar\varepsilon = p\bar\Omega - 1$, for every $\ell \in \mathbb{N}$,

Pn,p(F >i)- F^^^iF > ^) < C (k - e1 + ^ + ^) . (5.31) 

Note that, by (5.27),



\e-e\=C 



\ogn \ / N 



VS. (5.32) 



SO that we always have = 0{\e — e\). 

Consequently (with the value of $C$ adjusted between inequalities), for all vertex pairs $v_0, v_1$,



^>vo,v. <^(^k-^1 + ^j. (5.33) 

Summing over $v_0, v_1$,

= E ^vo.. ^ 7^ (I- - -I V ^) = V 17 V ]v^) ' (5-34) 



vo.vi 

smce e — e\ 



5.3. Proof of Proposition 2.2. We are now ready to complete the proof of Proposition 2.2. We will make essential use of Lemmas 5.2 and 5.3. The scale appearing in Proposition 2.2 will be given by $\overline{N}$, where $\overline{N}$ is determined below.

Let $\delta > 0$, and, for $i = 0, 1, \ldots, I-1$, let

6 
The reasons for our choice for $\{\delta_i\}_{i=0}^{I-1}$ will become apparent shortly. For now let us note that

$$\sum_{i=0}^{I-1} \delta_i \le \delta/2. \tag{5.36}$$

Recall the definition of $\bar{Z}_{\ge N}$ from (5.1) and the decomposition in (5.4). We will prove that the right hand side of (5.4) is $o(1)$ for suitable $\underline{N}$ and $\overline{N}$; the conditions that $\underline{N}$ and $\overline{N}$ must satisfy are as follows:

n (loffra)^ / \ 1/3 , , 

iV<r2, iV<eV, iV > ^ ° / , iV> , 5.37 

n-^e^ Vlogn/ 

$$\overline{N} \ll \varepsilon V, \qquad \overline{N} \le V^{2/3}. \tag{5.38}$$

Finally, Proposition O requires that iV > e"^. As e > {\ogVY/^V~^/^ = (log l^)^/^^^^/^, the 
choices N_ = ne^/^(log V^)^/^ and A^ = V^^/^ = n"^/^ clearly satisfy the bounds in fl5.37p - fl5.38p : thus 
we have proved that appropriate choices can be made. 

Let us note that it is here that the condition $\varepsilon \gg (\log n)^{1/3} V^{-1/3}$ in Theorem 1.1 arises. We need to show concentration of measure for clusters of size $\underline{N}$, which satisfies the constraint $\underline{N} \ll n$; for such clusters, we are unable to control very precisely the number of vertices per coordinate line (see Proposition 4.4). This then gives rise to the $\log n/n$ factor in Lemma 5.3 and hence at this point in our proof.

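We now prove that the concentration bound in Proposition 2.2 holds. By (5.37), $\underline{N}$ satisfies the hypotheses of Lemma 5.2; hence, using the Chebyshev inequality,

$$\mathbb{P}_p\big(|\bar{Z}_{\ge \underline{N}}| \ge \delta\varepsilon V/2\big) \le \frac{4\,\mathrm{Var}_p(Z_{\ge \underline{N}})}{(\delta\varepsilon V)^2} = o(1). \tag{5.39}$$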

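Denote $N_i = 2^{i+1}\underline{N}$, and recall the relation between $\underline{N}$ and $\overline{N}$ in (5.2). Since $N_i \le \overline{N}$, (5.38) implies that $N_i \ll \varepsilon V$ and $N_i \le V^{2/3}$ for each $i$. Therefore, applying Lemma 5.3 to $N_i = 2^{i+1}\underline{N}$ and using the Chebyshev inequality, we obtain

$$\mathbb{P}_p\big(|\bar{Z}_{\ge 2^{i+1}\underline{N}} - \bar{Z}_{\ge 2^{i}\underline{N}}| \ge \delta_i\varepsilon V\big) \le (\delta_i\varepsilon V)^{-2}\,\mathrm{Var}_p\big(Z_{\ge 2^{i+1}\underline{N}} - Z_{\ge 2^{i}\underline{N}}\big)$$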



It follows that under our assumptions 



/_1 CV]_ I logn \j X-^ 



j=0 



Each term here is given by 



(5.42) 



v/AT {bisVy VAT 

By the last assumption in fl5.37p . for all i, > ]^ > so that the last term is never equal to 
the maximum. It follows that we need to upper bound 

7-1 CV^(\o^ w 

> ^eV) < oil) + J2 (5.43) 
Letting $m$ be the smallest $i$ such that

$$\frac{\log n}{n} \le \frac{N_i}{V}, \tag{5.44}$$

we can write

I-l CV^( ]2EJ1\/ E±\ m ^ , I-l 



V = V + V C;^. (5.45) 



j=0 \ ' / j=Q V t r i=m+l 



Using our definition of 6i in (15.351) . we can upper bound X]j=m+i by 



52 - 52 

j=m,+l ' i=l 



16C(2)2 



< ^}!h}±L2^/^^J2k'2~'/' < C2'/^^ = CVN. (5.46) 

A;=l 



Hence the second sum in (5.45) is at most



i=m+l 

We want the right hand side of (5.47) to be $o(1)$, which forces

$$\overline{N} \ll \varepsilon^4 V^2. \tag{5.48}$$

The bound in (5.48) holds, since $\overline{N} = o(\varepsilon V)$, by the first constraint in (5.38), and since $\varepsilon^3 V \ge 1$.
On the other hand, the first sum in (5.45) can be upper bounded by



C log. ^ Clogn ^^^^^^ 



1=0 V t t V 



since $\underline{N} \gg (\log n)^2/(n^2\varepsilon^4)$ by the third bound in (5.37). This proves the required concentration bound, thus establishing Proposition 2.2 and Theorem 1.1. □